[Bioc-devel] depends on packages providing classes

Hervé Pagès hpages at fredhutch.org
Tue Oct 28 22:48:17 CET 2014



On 10/28/2014 12:42 PM, Vincent Carey wrote:
>
>
> On Tue, Oct 28, 2014 at 2:29 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
>
>     Hi,
>
>     On 10/28/2014 08:48 AM, Vincent Carey wrote:
>
>         On Tue, Oct 28, 2014 at 11:23 AM, Kasper Daniel Hansen
>         <kasperdanielhansen at gmail.com> wrote:
>
>             Well, first I want to make sure that there is not something
>             special regarding S4 methods and classes. I have a feeling
>             that they are a special case.
>
>             Second, while I agree with Jim's general opinion, it is a
>             little bit different when I have return objects which are
>             defined in other packages. If I don't depend on this other
>             package, the user is hosed wrt. the return object, unless I
>             manually export all classes from this other
>
>
>         In what sense?  If you return an instance of GRanges, certain
>         things can be done even if GenomicRanges is not attached.
>
>
>     Yes certain things maybe, but it's hard to predict which ones.
>
>           You can get values of slots, for example.
>
>         With the following little package
>
>         %vjcair> cat foo/NAMESPACE
>         importFrom(IRanges, IRanges)
>         importClassesFrom(GenomicRanges, GRanges)
>         importFrom(GenomicRanges, GRanges)
>         export(myfun)
>
>         %vjcair> cat foo/DESCRIPTION
>         Package: foo
>         Title: foo
>         Version: 0.0.0
>         Author: VJ Carey <stvjc at channing.harvard.edu>
>         Description:
>         Suggests:
>         Depends:
>         Imports: GenomicRanges
>         Maintainer: VJ Carey <stvjc at channing.harvard.edu>
>         License: Private
>         LazyLoad: yes
>
>         %vjcair> cat foo/R/*
>         myfun = function(seqnames="1", ranges=IRanges(1,2), ...)
>              GRanges(seqnames=seqnames, ranges=ranges, ...)
>
>         The following works:
>
>             library(foo)
>             x = myfun()
>             x
>         GRanges object with 1 range and 0 metadata columns:
>                 seqnames    ranges strand
>                    <Rle> <IRanges>  <Rle>
>             [1]        1    [1, 2]      *
>             -------
>             seqinfo: 1 sequence from an unspecified genome; no seqlengths
>
>         So the show method works, even though I have not touched it.  (I
>         did not expect it to work, in fact.)
>
>
>     Exactly. Let's call it luck ;-)
>
>           Additionally, I can get access to slots.
>
>
>     The end user should never try to access slots directly but use getters
>     and setters instead. And most getters and setters for GRanges objects
>     are defined and documented in the GenomicRanges package. Those that are
>     not are defined in packages that GenomicRanges depends on.
>
>           But ranges() fails.  If I, the user, want to use it, I need to
>         arrange for that.
>
>
>     IMO if your package returns a GRanges object to the user, then the user
>     should be able to access the man page for GRanges objects with ?GRanges.
>
>
> Oddly enough, that seems to be incorrect.  I added a man page to foo that
> has a \link[GenomicRanges]{GRanges-class}.  I ran help.start and the cross
> reference from my man page succeeds.  Furthermore, with the sessionInfo
> below, ?GRanges succeeds at the CLI.

Did you try to run example(GRanges)? I'm not sure that will work.

For example after I do library(rtracklayer), I can indeed do
?DNAStringSet at the command line (I'm surprised this works), but
then example(DNAStringSet) fails:

   > example(DNAStringSet)
   Warning message:
   In example(DNAStringSet) : no help found for ‘DNAStringSet’

I'm also surprised this is just a warning but that's another story...
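
To make the loaded-versus-attached distinction concrete, here is a minimal
sketch. It is not taken from any of the packages discussed in this thread:
it uses stats4, which ships with R, purely as a stand-in for a package like
GenomicRanges, and it assumes stats4 has not already been attached in the
session.

```r
# Assumption: stats4 (bundled with R) stands in for GenomicRanges and has
# not been attached with library(stats4) in this session.
loadNamespace("stats4")  # load the namespace without attaching it

ns_loaded    <- "stats4" %in% loadedNamespaces()
pkg_attached <- "package:stats4" %in% search()

# mle() is exported by stats4; an unqualified lookup only succeeds when
# the package (or something else defining mle) is on the search path.
found_unqualified <- exists("mle", mode = "function")

# Qualified access goes through the namespace, so it does not need the
# package to be attached.
found_qualified <- is.function(stats4::mle)

print(c(loaded = ns_loaded, attached = pkg_attached,
        unqualified = found_unqualified, qualified = found_qualified))
```

In a fresh session I would expect "loaded" and "qualified" to be TRUE and
"attached" to be FALSE, which is roughly the situation the end user is in
with GRanges when foo only imports GenomicRanges.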

H.

>  I am not trying to defend the NOTE, but the principle of minimizing
> Depends declarations needs to be considered critically, and I am just
> exploring the space.
>
>  > ?GRanges  # it worked as usual in the tty
>
>  > sessionInfo()
> R version 3.1.1 (2014-07-10)
> Platform: x86_64-apple-darwin13.1.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices datasets  utils     tools     methods
> [8] base
>
> other attached packages:
> [1] foo_0.0.0            rmarkdown_0.3.8      knitr_1.6
> [4] weaver_1.31.0        codetools_0.2-9      digest_0.6.4
> [7] BiocInstaller_1.16.0
>
> loaded via a namespace (and not attached):
>  [1] BiocGenerics_0.11.5   evaluate_0.5.5        formatR_1.0
>  [4] GenomeInfoDb_1.1.26   GenomicRanges_1.17.48 htmltools_0.2.6
>  [7] IRanges_1.99.32       parallel_3.1.1        S4Vectors_0.2.8
> [10] stats4_3.1.1          stringr_0.6.2         XVector_0.5.8
>
>     And that works only if the GenomicRanges package is attached. Attaching
>     GenomicRanges will also attach other packages that GenomicRanges depends
>     on where some GRanges accessors might be defined and documented (e.g.
>     metadata()).
>
>
>
>         In some cases you'll decide you want the user to have a full
>         complement of methods for your package to function meaningfully.
>         For example, I am considering using dplyr idioms to work with data
>         structures in a package, and it seems I should just depend on dplyr
>         rather than pick out and document which things I want to expose.
>         But that may still be an undesirable design.
>
>
>             package, like
>                 importClassesFrom("GenomicRanges", "GRanges")
>                 exportClasses("GRanges")
>             Surely that is not intended.
>
>             It is important that my package works without being attached
>             to the search path and I do this by carefully importing what
>             I need, i.e. my code does not require that my dependencies
>             are attached to the search path.  But the end user will be
>             hosed without it.
>
>
>     Yes s/he will. Fortunately when your package namespace gets loaded by
>     another package, then nothing gets attached to the search path, even if
>     your package depends (instead of imports) on other packages. So using
>     Depends instead of Imports for your own dependencies won't make any
>     difference in that respect, which is good.
>
>
>             My impression is that the NOTE in R CMD check was written by
>             someone who did not anticipate large-scale use and re-use of
>             classes and methods across many packages.
>
>
>     That's my impression too.
>
>     Cheers,
>     H.
>
>
>             Best,
>             Kasper
>
>
>             On Tue, Oct 28, 2014 at 11:14 AM, James W. MacDonald
>             <jmacdon at uw.edu> wrote:
>
>                 I agree with Vince. It's your job as a package developer
>                 to make available to your package all the functions
>                 necessary for the package to work. But I am not sure it
>                 is your job to load all the packages that your end user
>                 might need.
>
>                 Best,
>
>                 Jim
>
>
>
>                 On Tue, Oct 28, 2014 at 11:04 AM, Vincent Carey
>                 <stvjc at channing.harvard.edu> wrote:
>
>                     On Tue, Oct 28, 2014 at 10:19 AM, Kasper Daniel Hansen
>                     <kasperdanielhansen at gmail.com> wrote:
>
>                         What is the current best paradigm for using all
>                         the classes in
>                         S4Vectors/GenomeInfoDb/GenomicRanges/IRanges?
>
>                         I obviously import methods and classes from the
>                         relevant packages.
>
>                         But shouldn't I depend on these packages as
>                         well?  Since I basically want the user to have
>                         this functionality at the command line? That is
>                         what I do now.
>
>
>                     I've wondered about this as well.  It seems the
>                     principle is that the user should take care of
>                     attaching additional packages when needed.  It might
>                     be appropriate to give a hint in the package startup
>                     message, if having some other package attached would
>                     typically be of great utility.
>
>                     Given your list above, I would think that depending
>                     on GenomicRanges would often be sufficient, and
>                     IRanges/S4Vectors would not require dependency
>                     assertion.  I would think that GenomeInfoDb should be
>                     a voluntary attachment for a specific session.
>
>                     These are just my guesses -- I doubt there will be
>                     complete consensus, but I have started to think very
>                     critically about using Depends, and I think it is
>                     better when its use is minimized.
>
>
>                         That of course leads to the R CMD check NOTE on
>                         depending on too many
>                         packages.... I guess I should ignore that one.
>
>                         Best,
>                         Kasper
>
>                                   [[alternative HTML version deleted]]
>
>                         _________________________________________________
>                         Bioc-devel at r-project.org mailing list
>                         https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>
>
>
>
>
>                 --
>                 James W. MacDonald, M.S.
>                 Biostatistician
>                 University of Washington
>                 Environmental and Occupational Health Sciences
>                 4225 Roosevelt Way NE, # 100
>                 Seattle WA 98105-6099
>
>
>
>
>
>
>     --
>     Hervé Pagès
>
>     Program in Computational Biology
>     Division of Public Health Sciences
>     Fred Hutchinson Cancer Research Center
>     1100 Fairview Ave. N, M1-B514
>     P.O. Box 19024
>     Seattle, WA 98109-1024
>
>     E-mail: hpages at fredhutch.org
>     Phone: (206) 667-5791
>     Fax: (206) 667-1319
>
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


