[Bioc-devel] depends on packages providing classes
Hervé Pagès
hpages at fredhutch.org
Tue Oct 28 22:48:17 CET 2014
On 10/28/2014 12:42 PM, Vincent Carey wrote:
>
>
> On Tue, Oct 28, 2014 at 2:29 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
>
> Hi,
>
> On 10/28/2014 08:48 AM, Vincent Carey wrote:
>
> On Tue, Oct 28, 2014 at 11:23 AM, Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:
>
> Well, first I want to make sure that there is not something special
> regarding S4 methods and classes. I have a feeling that they are a
> special case.
>
> Second, while I agree with Jim's general opinion, it is a little bit
> different when I have return objects which are defined in other packages.
> If I don't depend on this other package, the user is hosed wrt. the return
> object, unless I manually export all classes from this other
>
>
> In what sense? If you return an instance of GRanges, certain things can
> be done even if GenomicRanges is not attached.
>
>
> Yes certain things maybe, but it's hard to predict which ones.
>
> You can get values of slots, for example.
>
> With the following little package
>
> %vjcair> cat foo/NAMESPACE
> importFrom(IRanges, IRanges)
> importClassesFrom(GenomicRanges, GRanges)
> importFrom(GenomicRanges, GRanges)
> export(myfun)
>
>
>
> %vjcair> cat foo/DESCRIPTION
> Package: foo
> Title: foo
> Version: 0.0.0
> Author: VJ Carey <stvjc at channing.harvard.edu>
> Description:
> Suggests:
> Depends:
> Imports: GenomicRanges
> Maintainer: VJ Carey <stvjc at channing.harvard.edu>
> License: Private
> LazyLoad: yes
>
>
>
> %vjcair> cat foo/R/*
> myfun = function(seqnames="1", ranges=IRanges(1,2), ...)
>     GRanges(seqnames=seqnames, ranges=ranges, ...)
>
>
> The following works:
>
> > library(foo)
> > x = myfun()
> > x
> GRanges object with 1 range and 0 metadata columns:
>       seqnames    ranges strand
>          <Rle> <IRanges>  <Rle>
>   [1]        1    [1, 2]      *
>   -------
>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
>
>
> So the show method works, even though I have not touched it. (I did not
> expect it to work, in fact.)
>
>
> Exactly. Let's call it luck ;-)
>
> Additionally, I can get access to slots.
>
>
> The end user should never try to access slots directly but use getters
> and setters instead. And most getters and setters for GRanges objects
> are defined and documented in the GenomicRanges package. Those that are
> not are defined in packages that GenomicRanges depends on.
>
> But ranges() fails. If I, the user, want to use it, I need to arrange
> for that.
>
>
> IMO if your package returns a GRanges object to the user, then the user
> should be able to access the man page for GRanges objects with ?GRanges.
>
>
> Oddly enough, that seems to be incorrect. I added a man page to foo that
> has a \link[GenomicRanges]{GRanges-class}. I ran help.start and the cross
> reference from my man page succeeds. Furthermore with the sessionInfo
> below, ?GRanges succeeds at the CLI.
Did you try to run example(GRanges)? I'm not sure that will work.
For example after I do library(rtracklayer), I can indeed do
?DNAStringSet at the command line (I'm surprised this works), but
then example(DNAStringSet) fails:
> example(DNAStringSet)
Warning message:
In example(DNAStringSet) : no help found for ‘DNAStringSet’
I'm also surprised this is just a warning but that's another story...
H.
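
For what it's worth, the loaded-versus-attached distinction behind all of this can be reproduced without any Bioconductor package. A minimal sketch, assuming a fresh session in which stats4 (shipped with R) is not yet attached; stats4 and its mle() function merely stand in for GenomicRanges and ranges() here, and the variable names are only illustrative:

```r
# Sketch only: 'stats4' stands in for GenomicRanges, 'mle' for ranges().
# Assumption: stats4 is not attached when this starts (true for a default session).

# Roughly what Imports gives you: the namespace is loaded, not attached.
loadNamespace("stats4")

loaded         <- "stats4" %in% loadedNamespaces()
attached       <- "package:stats4" %in% search()
visible_before <- exists("mle")   # lookup goes along the search path

# Qualified access works even though nothing is attached.
f <- stats4::mle

# What Depends (or the user calling library()) adds: attachment.
library(stats4)
visible_after <- exists("mle")

print(c(loaded = loaded, attached = attached,
        visible_before = visible_before, visible_after = visible_after))
```

In a fresh session I would expect this to report the namespace as loaded but not attached, with mle invisible at the prompt until library(stats4) is called. That is the position the end user of foo is in with respect to GenomicRanges.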
> I am not trying to defend the NOTE but the principle of minimizing
> Depends declarations needs to be considered critically, and I am just
> exploring the space.
>
> > ?GRanges # it worked as usual in the tty
> > sessionInfo()
> R version 3.1.1 (2014-07-10)
> Platform: x86_64-apple-darwin13.1.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices datasets  utils     tools     methods
> [8] base
>
> other attached packages:
> [1] foo_0.0.0            rmarkdown_0.3.8      knitr_1.6
> [4] weaver_1.31.0        codetools_0.2-9      digest_0.6.4
> [7] BiocInstaller_1.16.0
>
> loaded via a namespace (and not attached):
>  [1] BiocGenerics_0.11.5   evaluate_0.5.5        formatR_1.0
>  [4] GenomeInfoDb_1.1.26   GenomicRanges_1.17.48 htmltools_0.2.6
>  [7] IRanges_1.99.32       parallel_3.1.1        S4Vectors_0.2.8
> [10] stats4_3.1.1          stringr_0.6.2         XVector_0.5.8
>
> And that works only if the GenomicRanges package is attached. Attaching
> GenomicRanges will also attach other packages that GenomicRanges depends
> on where some GRanges accessors might be defined and documented (e.g.
> metadata()).
>
>
>
> In some cases you'll decide you want the user to have a full complement
> of methods for your package to function meaningfully. For example, I am
> considering using dplyr idioms to work with data structures in a package,
> and it seems I should just depend on dplyr rather than pick out and
> document which things I want to expose. But that may still be an
> undesirable design.
>
>
> package, like
> importClassesFrom("GenomicRanges", "GRanges")
> exportClasses("GRanges")
> Surely that is not intended.
>
> It is important that my package works without being attached to the
> search path and I do this by carefully importing what I need, ie. my code
> does not require that my dependencies are attached to the search path.
> But the end user will be hosed without it.
>
>
> Yes s/he will. Fortunately when your package namespace gets loaded by
> another package, then nothing gets attached to the search path, even if
> your package depends (instead of imports) on other packages. So using
> Depends instead of Imports for your own dependencies won't make any
> difference in that respect, which is good.
>
>
> My impression is that the NOTE in R CMD check was written by someone who
> did not anticipate large-scale use and re-use of classes and methods
> across many packages.
>
>
> That's my impression too.
>
> Cheers,
> H.
>
>
> Best,
> Kasper
>
>
> On Tue, Oct 28, 2014 at 11:14 AM, James W. MacDonald <jmacdon at uw.edu> wrote:
>
> I agree with Vince. It's your job as a package developer to make
> available to your package all the functions necessary for the package to
> work. But I am not sure it is your job to load all the packages that your
> end user might need.
>
> Best,
>
> Jim
>
>
>
> On Tue, Oct 28, 2014 at 11:04 AM, Vincent Carey <stvjc at channing.harvard.edu> wrote:
>
> On Tue, Oct 28, 2014 at 10:19 AM, Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:
>
> What is the current best paradigm for using all the classes in
> S4Vectors/GenomeInfoDb/GenomicRanges/IRanges?
>
> I obviously import methods and classes from the relevant packages.
>
> But shouldn't I depend on these packages as well? Since I basically want
> the user to have this functionality at the command line? That is what I
> do now.
>
>
> I've wondered about this as well. It seems the principle is that the
> user should take care of attaching additional packages when needed. It
> might be appropriate to give a hint in the package startup message, if
> having some other package attached would typically be of great utility.
>
> Given your list above, I would think that depending on GenomicRanges
> would often be sufficient, and IRanges/S4Vectors would not require
> dependency assertion. I would think that GenomeInfoDb should be a
> voluntary attachment for a specific session.
>
> These are just my guesses -- I doubt there will be complete consensus,
> but I have started to think very critically about using Depends, and I
> think it is better when its use is minimized.
>
>
> That of course leads to the R CMD check NOTE on depending on too many
> packages.... I guess I should ignore that one.
>
> Best,
> Kasper
>
> [[alternative HTML version deleted]]
>
> _________________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>
>
>
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>
>
>
>
>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fredhutch.org
> Phone: (206) 667-5791
> Fax: (206) 667-1319
>
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319