[Bioc-devel] depends on packages providing classes

Vincent Carey stvjc at channing.harvard.edu
Wed Oct 29 04:51:10 CET 2014


On Tue, Oct 28, 2014 at 5:48 PM, Hervé Pagès <hpages at fredhutch.org> wrote:

>
>
> On 10/28/2014 12:42 PM, Vincent Carey wrote:
>
>>
>>
>> On Tue, Oct 28, 2014 at 2:29 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
>>
>>     Hi,
>>
>>     On 10/28/2014 08:48 AM, Vincent Carey wrote:
>>
>>         On Tue, Oct 28, 2014 at 11:23 AM, Kasper Daniel Hansen
>>         <kasperdanielhansen at gmail.com> wrote:
>>
>>             Well, first I want to make sure that there is not something
>>             special regarding S4 methods and classes. I have a feeling
>>             that they are a special case.
>>
>>             Second, while I agree with Jim's general opinion, it is a
>>             little bit different when I have return objects which are
>>             defined in other packages. If I don't depend on this other
>>             package, the user is hosed wrt. the return object, unless I
>>             manually export all classes from this other
>>
>>
>>         In what sense?  If you return an instance of GRanges, certain
>>         things can be done even if GenomicRanges is not attached.
>>
>>
>>     Yes certain things maybe, but it's hard to predict which ones.
>>
>>           You can get values of slots, for example.
>>
>>         With the following little package
>>
>>         %vjcair> cat foo/NAMESPACE
>>         importFrom(IRanges, IRanges)
>>         importClassesFrom(GenomicRanges, GRanges)
>>         importFrom(GenomicRanges, GRanges)
>>         export(myfun)
>>
>>
>>
>>         %vjcair> cat foo/DESCRIPTION
>>         Package: foo
>>         Title: foo
>>         Version: 0.0.0
>>         Author: VJ Carey <stvjc at channing.harvard.edu>
>>         Description:
>>         Suggests:
>>         Depends:
>>         Imports: GenomicRanges
>>         Maintainer: VJ Carey <stvjc at channing.harvard.edu>
>>         License: Private
>>         LazyLoad: yes
>>
>>
>>
>>         %vjcair> cat foo/R/*
>>         myfun = function(seqnames="1", ranges=IRanges(1,2), ...)
>>              GRanges(seqnames=seqnames, ranges=ranges, ...)
>>
>>
>>         The following works:
>>
>>         > library(foo)
>>         > x = myfun()
>>         > x
>>         GRanges object with 1 range and 0 metadata columns:
>>               seqnames    ranges strand
>>                  <Rle> <IRanges>  <Rle>
>>           [1]        1    [1, 2]      *
>>           -------
>>           seqinfo: 1 sequence from an unspecified genome; no seqlengths
>>
>>         So the show method works, even though I have not touched it.  (I
>>         did not expect it to work, in fact.)
>>
>>
>>     Exactly. Let's call it luck ;-)
>>
>>           Additionally, I can get access to slots.
>>
>>
>>     The end user should never try to access slots directly but use getters
>>     and setters instead. And most getters and setters for GRanges objects
>>     are defined and documented in the GenomicRanges package. Those that
>>     are not are defined in packages that GenomicRanges depends on.
>>
>>           But ranges() fails.  If I, the user, want to use it, I need to
>>         arrange for that.
>>
>>
>>     IMO if your package returns a GRanges object to the user, then the
>>     user should be able to access the man page for GRanges objects with
>>     ?GRanges.
>>
>>
>> Oddly enough, that seems to be incorrect.  I added a man page to foo
>> that has a \link[GenomicRanges]{GRanges-class}.  I ran help.start and
>> the cross reference from my man page succeeds.  Furthermore with the
>> sessionInfo below, ?GRanges succeeds at the CLI.
>>
>
> Did you try to run example(GRanges)? I'm not sure that will work.
>

Correct.  A cursory look at the source shows that help() uses
loadedNamespaces() to find the help file.  example() could probably do
likewise.
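
A minimal sketch of the loaded-versus-attached distinction, for anyone who
wants to poke at it.  Assumptions: stats4 (which ships with R) stands in for
GenomicRanges, "mle" is just a topic documented there, and this runs in a
fresh session.  It only checks what help() finds; it does not show how
example() does its lookup.

```r
# Sketch only: stats4 is a stand-in for GenomicRanges because it ships with R.
loadNamespace("stats4")                           # load the namespace without attaching it

is_loaded   <- "stats4" %in% loadedNamespaces()   # is the namespace loaded?
is_attached <- "package:stats4" %in% search()     # is the package on the search path?

# help() with no 'package' argument; the result has length 0 if no help
# file is found for the topic.
h <- help("mle")

cat("loaded:", is_loaded,
    " attached:", is_attached,
    " help files found:", length(h), "\n")
```

If help() reports a file while the package is loaded but not attached, that
matches what I saw with ?GRanges.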


>
> For example after I do library(rtracklayer), I can indeed do
> ?DNAStringSet at the command line (I'm surprised this works), but
> then example(DNAStringSet) fails:
>
>   > example(DNAStringSet)
>   Warning message:
>   In example(DNAStringSet) : no help found for ‘DNAStringSet’
>
> I'm also surprised this is just a warning but that's another story...
>
> H.
>
>> I am not trying to defend the NOTE but the principle of minimizing
>> Depends declarations needs to be considered critically, and I am just
>> exploring the space.
>>
>>  > ?GRanges  # it worked as usual in the tty
>>  > sessionInfo()
>> R version 3.1.1 (2014-07-10)
>> Platform: x86_64-apple-darwin13.1.0 (64-bit)
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] stats     graphics  grDevices datasets  utils     tools     methods
>> [8] base
>>
>> other attached packages:
>> [1] foo_0.0.0            rmarkdown_0.3.8      knitr_1.6
>> [4] weaver_1.31.0        codetools_0.2-9      digest_0.6.4
>> [7] BiocInstaller_1.16.0
>>
>> loaded via a namespace (and not attached):
>>  [1] BiocGenerics_0.11.5   evaluate_0.5.5        formatR_1.0
>>  [4] GenomeInfoDb_1.1.26   GenomicRanges_1.17.48 htmltools_0.2.6
>>  [7] IRanges_1.99.32       parallel_3.1.1        S4Vectors_0.2.8
>> [10] stats4_3.1.1          stringr_0.6.2         XVector_0.5.8
>>
>>     And that works only if the GenomicRanges package is attached.
>>     Attaching GenomicRanges will also attach other packages that
>>     GenomicRanges depends on where some GRanges accessors might be
>>     defined and documented (e.g. metadata()).
>>
>>
>>
>>         In some cases you'll decide you want the user to have a full
>>         complement of methods for your package to function meaningfully.
>>         For example, I am considering using dplyr idioms to work with
>>         data structures in a package, and it seems I should just depend
>>         on dplyr rather than pick out and document which things I want
>>         to expose.  But that may still be an undesirable design.
>>
>>
>>             package, like
>>                 importClassesFrom("GenomicRanges", "GRanges")
>>                 exportClasses("GRanges")
>>             Surely that is not intended.
>>
>>             It is important that my package works without being attached
>>             to the search path and I do this by carefully importing what
>>             I need, ie. my code does not require that my dependencies are
>>             attached to the search path.  But the end user will be hosed
>>             without it.
>>
>>
>>     Yes s/he will. Fortunately when your package namespace gets loaded by
>>     another package, then nothing gets attached to the search path, even
>>     if your package depends (instead of imports) on other packages. So
>>     using Depends instead of Imports for your own dependencies won't make
>>     any difference in that respect, which is good.
>>
>>
>>             My impression is that the NOTE in R CMD check was written by
>>             someone who did not anticipate large-scale use and re-use of
>>             classes and methods across many packages.
>>
>>
>>     That's my impression too.
>>
>>     Cheers,
>>     H.
>>
>>
>>             Best,
>>             Kasper
>>
>>
>>             On Tue, Oct 28, 2014 at 11:14 AM, James W. MacDonald
>>             <jmacdon at uw.edu> wrote:
>>
>>                 I agree with Vince. It's your job as a package developer
>>                 to make available to your package all the functions
>>                 necessary for the package to work. But I am not sure it is
>>                 your job to load all the packages that your end user might
>>                 need.
>>
>>                 Best,
>>
>>                 Jim
>>
>>
>>
>>                 On Tue, Oct 28, 2014 at 11:04 AM, Vincent Carey
>>                 <stvjc at channing.harvard.edu> wrote:
>>
>>                     On Tue, Oct 28, 2014 at 10:19 AM, Kasper Daniel Hansen
>>                     <kasperdanielhansen at gmail.com> wrote:
>>
>>                         What is the current best paradigm for using all
>>                         the classes in
>>                         S4Vectors/GenomeInfoDb/GenomicRanges/IRanges
>>
>>                         I obviously import methods and classes from the
>>                         relevant packages.
>>
>>                         But shouldn't I depend on these packages as
>>                         well?  Since I basically want the user to have
>>                         this functionality at the command line? That is
>>                         what I do now.
>>
>>
>>                     I've wondered about this as well.  It seems the
>>                     principle is that the user should take care of
>>                     attaching additional packages when needed.  It might
>>                     be appropriate to give a hint in the package startup
>>                     message, if having some other package attached would
>>                     typically be of great utility.
>>
>>                     Given your list above, I would think that depending
>>                     on GenomicRanges would often be sufficient, and
>>                     IRanges/S4Vectors would not require dependency
>>                     assertion.  I would think that GenomeInfoDb should be
>>                     a voluntary attachment for a specific session.
>>
>>                     These are just my guesses -- I doubt there will be
>>                     complete consensus, but I have started to think very
>>                     critically about using Depends, and I think it is
>>                     better when its use is minimized.
>>
>>
>>                         That of course leads to the R CMD check NOTE on
>>                         depending on too many
>>                         packages.... I guess I should ignore that one.
>>
>>                         Best,
>>                         Kasper
>>
>>                         _________________________________________________
>>                         Bioc-devel at r-project.org mailing list
>>                         https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>>
>>
>>
>>
>>                 --
>>                 James W. MacDonald, M.S.
>>                 Biostatistician
>>                 University of Washington
>>                 Environmental and Occupational Health Sciences
>>                 4225 Roosevelt Way NE, # 100
>>                 Seattle WA 98105-6099
>>
>>
>>
>>
>>
>>
>>     --
>>     Hervé Pagès
>>
>>     Program in Computational Biology
>>     Division of Public Health Sciences
>>     Fred Hutchinson Cancer Research Center
>>     1100 Fairview Ave. N, M1-B514
>>     P.O. Box 19024
>>     Seattle, WA 98109-1024
>>
>>     E-mail: hpages at fredhutch.org
>>     Phone: (206) 667-5791
>>     Fax: (206) 667-1319
>>
>>
>>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fredhutch.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
>
