[Bioc-devel] depends on packages providing classes
Vincent Carey
stvjc at channing.harvard.edu
Wed Oct 29 04:51:10 CET 2014
On Tue, Oct 28, 2014 at 5:48 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
>
>
> On 10/28/2014 12:42 PM, Vincent Carey wrote:
>
>>
>>
>> On Tue, Oct 28, 2014 at 2:29 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
>>
>> Hi,
>>
>> On 10/28/2014 08:48 AM, Vincent Carey wrote:
>>
>> On Tue, Oct 28, 2014 at 11:23 AM, Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:
>>
>> Well, first I want to make sure that there is not something special
>> regarding S4 methods and classes. I have a feeling that they are a
>> special case.
>>
>> Second, while I agree with Jim's general opinion, it is a little bit
>> different when I have return objects which are defined in other packages.
>> If I don't depend on this other package, the user is hosed wrt. the return
>> object, unless I manually export all classes from this other
>>
>>
>> In what sense? If you return an instance of GRanges, certain things can be
>> done even if GenomicRanges is not attached.
>>
>>
>> Yes certain things maybe, but it's hard to predict which ones.
>>
>> You can get values of slots, for example.
>>
>> With the following little package
>>
>> %vjcair> cat foo/NAMESPACE
>>
>> importFrom(IRanges, IRanges)
>>
>> importClassesFrom(GenomicRanges, GRanges)
>>
>> importFrom(GenomicRanges, GRanges)
>>
>> export(myfun)
>>
>>
>>
>> %vjcair> cat foo/DESCRIPTION
>>
>> Package: foo
>>
>> Title: foo
>>
>> Version: 0.0.0
>>
>> Author: VJ Carey <stvjc at channing.harvard.edu>
>>
>> Description:
>>
>> Suggests:
>>
>> Depends:
>>
>> Imports: GenomicRanges
>>
>> Maintainer: VJ Carey <stvjc at channing.harvard.edu>
>>
>>
>> License: Private
>>
>> LazyLoad: yes
>>
>>
>>
>> %vjcair> cat foo/R/*
>>
>> myfun = function(seqnames="1", ranges=IRanges(1,2), ...)
>>
>> GRanges(seqnames=seqnames, ranges=ranges, ...)
>>
>>
>> The following works:
>>
>>
>> library(foo)
>>
>>
>> x = myfun()
>>
>>
>> x
>>
>>
>> GRanges object with 1 range and 0 metadata columns:
>>       seqnames    ranges strand
>>          <Rle> <IRanges>  <Rle>
>>   [1]        1    [1, 2]      *
>>   -------
>>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
>>
>>
>> So the show method works, even though I have not touched it. (I did not
>> expect it to work, in fact.)
>>
>>
>> Exactly. Let's call it luck ;-)
>>
>> Additionally, I can get access to slots.
>>
>>
>> The end user should never try to access slots directly but use getters
>> and setters instead. And most getters and setters for GRanges objects
>> are defined and documented in the GenomicRanges package. Those that are
>> not are defined in packages that GenomicRanges depends on.
>>
>> But ranges() fails. If I, the user, want to use it, I need to arrange for that.
>>
>>
>> IMO if your package returns a GRanges object to the user, then the user
>> should be able to access the man page for GRanges objects with ?GRanges.
>>
>>
>> Oddly enough, that seems to be incorrect. I added a man page to foo that has
>> a \link[GenomicRanges]{GRanges-class}. I ran help.start and the cross reference
>> from my man page succeeds. Furthermore with the sessionInfo below, ?GRanges
>> succeeds at the CLI.
>>
>
> Did you try to run example(GRanges)? I'm not sure that will work.
>
Correct. A cursory look at the source shows that help() uses loadedNamespaces()
to find the help file. example() could probably do likewise.
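
A minimal sketch of that difference, using the splines package that ships
with R as a stand-in for GenomicRanges (an assumption made only so the
snippet runs without Bioconductor; the variable names are illustrative).
It relies on the documented defaults: help() looks in all loaded
namespaces, while example() looks only in packages on the search path.

```r
# Load a namespace without attaching it; splines ships with R and stands in
# for GenomicRanges here (an assumption for illustration only).
invisible(loadNamespace("splines"))

loaded   <- "splines" %in% loadedNamespaces()   # namespace is loaded
attached <- "package:splines" %in% search()     # is it on the search path?

# help() searches loaded namespaces by default, so the topic can be found
# even though the package is not attached.
found_by_help <- length(help("bs")) > 0

# example() searches only attached packages by default; capture the warning
# it is expected to raise instead of letting it print.
msg <- tryCatch({ example("bs"); NA_character_ },
                warning = function(w) conditionMessage(w))

print(c(loaded = loaded, attached = attached, found_by_help = found_by_help))
print(msg)
```

Naming the package explicitly, e.g. example("bs", package = "splines"),
should let example() locate the topic; as far as I can tell it attaches
that package as a side effect.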
>
> For example after I do library(rtracklayer), I can indeed do
> ?DNAStringSet at the command line (I'm surprised this works), but
> then example(DNAStringSet) fails:
>
> > example(DNAStringSet)
> Warning message:
> In example(DNAStringSet) : no help found for ‘DNAStringSet’
>
> I'm also surprised this is just a warning but that's another story...
>
> H.
>
>> I am not trying to defend the NOTE but the principle of minimizing
>> Depends declarations needs to be considered critically, and I am just
>> exploring the space.
>>
>> > ?GRanges # it worked as usual in the tty
>>
>> > sessionInfo()
>>
>> R version 3.1.1 (2014-07-10)
>>
>> Platform: x86_64-apple-darwin13.1.0 (64-bit)
>>
>>
>> locale:
>>
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>>
>> attached base packages:
>>
>> [1] stats graphics grDevices datasets utils tools methods
>>
>> [8] base
>>
>>
>> other attached packages:
>>
>> [1] foo_0.0.0 rmarkdown_0.3.8 knitr_1.6
>>
>> [4] weaver_1.31.0 codetools_0.2-9 digest_0.6.4
>>
>> [7] BiocInstaller_1.16.0
>>
>>
>> loaded via a namespace (and not attached):
>>
>> [1] BiocGenerics_0.11.5 evaluate_0.5.5 formatR_1.0
>>
>> [4] GenomeInfoDb_1.1.26 GenomicRanges_1.17.48 htmltools_0.2.6
>>
>> [7] IRanges_1.99.32 parallel_3.1.1 S4Vectors_0.2.8
>>
>> [10] stats4_3.1.1 stringr_0.6.2 XVector_0.5.8
>>
>> And that works only if the GenomicRanges package is attached. Attaching
>> GenomicRanges will also attach other packages that GenomicRanges depends
>> on where some GRanges accessors might be defined and documented (e.g.
>> metadata()).
>>
>>
>>
>> In some cases you'll decide you want the user to have a full complement of
>> methods for your package to function meaningfully. For example, I am
>> considering using dplyr idioms to work with data structures in a package,
>> and it seems I should just depend on dplyr rather than pick out and document
>> which things I want to expose. But that may still be an undesirable design.
>>
>>
>> package, like
>> importClassesFrom("GenomicRanges", "GRanges")
>>
>> exportClasses("GRanges")
>> Surely that is not intended.
>>
>> It is important that my package works without being attached to the search
>> path and I do this by carefully importing what I need, i.e. my code does not
>> require that my dependencies are attached to the search path. But the end
>> user will be hosed without it.
>>
>>
>> Yes s/he will. Fortunately when your package namespace gets loaded by
>> another package, then nothing gets attached to the search path, even if
>> your package depends (instead of imports) on other packages. So using
>> Depends instead of Imports for your own dependencies won't make any
>> difference in that respect, which is good.
>>
>>
>> My impression is that the NOTE in R CMD check was written by someone who
>> did not anticipate large-scale use and re-use of classes and methods across
>> many packages.
>>
>>
>> That's my impression too.
>>
>> Cheers,
>> H.
>>
>>
>> Best,
>> Kasper
>>
>>
>> On Tue, Oct 28, 2014 at 11:14 AM, James W. MacDonald <jmacdon at uw.edu> wrote:
>>
>> I agree with Vince. It's your job as a package developer to make
>> available to your package all the functions necessary for the package to
>> work. But I am not sure it is your job to load all the packages that your
>> end user might need.
>>
>> Best,
>>
>> Jim
>>
>>
>>
>> On Tue, Oct 28, 2014 at 11:04 AM, Vincent Carey <stvjc at channing.harvard.edu> wrote:
>>
>> On Tue, Oct 28, 2014 at 10:19 AM, Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:
>>
>> What is the current best paradigm for using all the classes in
>> S4Vectors/GenomeInfoDb/GenomicRanges/IRanges?
>>
>>
>> I obviously import methods and classes from the relevant packages.
>> But shouldn't I depend on these packages as well? Since I basically want
>> the user to have this functionality at the command line? That is what I do
>> now.
>>
>>
>> I've wondered about this as well. It seems the principle is that the user
>> should take care of attaching additional packages when needed. It might be
>> appropriate to give a hint in the package startup message, if having some
>> other package attached would typically be of great utility.
>>
>> Given your list above, I would think that depending on GenomicRanges would
>> often be sufficient, and IRanges/S4Vectors would not require dependency
>> assertion. I would think that GenomeInfoDb should be a voluntary attachment
>> for a specific session.
>>
>> These are just my guesses -- I doubt there will be complete consensus, but
>> I have started to think very critically about using Depends, and I think it
>> is better when its use is minimized.
>>
>>
>> That of course leads to the R CMD check NOTE on depending on too many
>> packages.... I guess I should ignore that one.
>>
>> Best,
>> Kasper
>>
>> [[alternative HTML version deleted]]
>>
>> _________________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
>>
>>
>>
>>
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> University of Washington
>> Environmental and Occupational Health Sciences
>> 4225 Roosevelt Way NE, # 100
>> Seattle WA 98105-6099
>>
>>
>>
>>
>>
>>
>> --
>> Hervé Pagès
>>
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B514
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: hpages at fredhutch.org
>> Phone: (206) 667-5791
>> Fax: (206) 667-1319
>>
>>
>>