[Bioc-devel] depends on packages providing classes

Vincent Carey stvjc at channing.harvard.edu
Tue Oct 28 20:42:32 CET 2014


On Tue, Oct 28, 2014 at 2:29 PM, Hervé Pagès <hpages at fredhutch.org> wrote:

> Hi,
>
> On 10/28/2014 08:48 AM, Vincent Carey wrote:
>
>> On Tue, Oct 28, 2014 at 11:23 AM, Kasper Daniel Hansen <
>> kasperdanielhansen at gmail.com> wrote:
>>
>>> Well, first I want to make sure that there is not something special
>>> regarding S4 methods and classes. I have a feeling that they are a
>>> special case.
>>>
>>> Second, while I agree with Jim's general opinion, it is a little bit
>>> different when I have return objects which are defined in other packages.
>>> If I don't depend on this other package, the user is hosed wrt. the
>>> return object, unless I manually export all classes from this other
>>>
>>>
>> In what sense?  If you return an instance of GRanges, certain things can
>> be done even if GenomicRanges is not attached.
>>
>
> Yes certain things maybe, but it's hard to predict which ones.
>
>> You can get values of slots, for example.
>>
>> With the following little package
>>
>> %vjcair> cat foo/NAMESPACE
>> importFrom(IRanges, IRanges)
>> importClassesFrom(GenomicRanges, GRanges)
>> importFrom(GenomicRanges, GRanges)
>> export(myfun)
>>
>> %vjcair> cat foo/DESCRIPTION
>> Package: foo
>> Title: foo
>> Version: 0.0.0
>> Author: VJ Carey <stvjc at channing.harvard.edu>
>> Description:
>> Suggests:
>> Depends:
>> Imports: GenomicRanges
>> Maintainer: VJ Carey <stvjc at channing.harvard.edu>
>> License: Private
>> LazyLoad: yes
>>
>> %vjcair> cat foo/R/*
>> myfun = function(seqnames="1", ranges=IRanges(1,2), ...)
>>     GRanges(seqnames=seqnames, ranges=ranges, ...)
>>
>>
>> The following works:
>>
>>
>> > library(foo)
>> > x = myfun()
>> > x
>> GRanges object with 1 range and 0 metadata columns:
>>        seqnames    ranges strand
>>           <Rle> <IRanges>  <Rle>
>>    [1]        1    [1, 2]      *
>>    -------
>>    seqinfo: 1 sequence from an unspecified genome; no seqlengths
>>
>>
>> So the show method works, even though I have not touched it.  (I did not
>> expect it to work, in fact.)
>>
>
> Exactly. Let's call it luck ;-)
>
>> Additionally, I can get access to slots.
>>
>
> The end user should never try to access slots directly but use getters
> and setters instead. And most getters and setters for GRanges objects
> are defined and documented in the GenomicRanges package. Those that are
> not are defined in packages that GenomicRanges depends on.
>
>> But ranges() fails.  If I, the user, want to use it, I need to arrange
>> for that.
>>
>
> IMO if your package returns a GRanges object to the user, then the user
> should be able to access the man page for GRanges objects with ?GRanges.
>

Oddly enough, that seems to be incorrect.  I added a man page to foo that
has a \link[GenomicRanges]{GRanges-class}.  I ran help.start() and the
cross-reference from my man page succeeds.  Furthermore, with the
sessionInfo below, ?GRanges succeeds at the CLI.  I am not trying to defend
the NOTE, but the principle of minimizing Depends declarations needs to be
considered critically, and I am just exploring the space.

> ?GRanges  # it worked as usual in the tty
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices datasets  utils     tools     methods
[8] base

other attached packages:
[1] foo_0.0.0            rmarkdown_0.3.8      knitr_1.6
[4] weaver_1.31.0        codetools_0.2-9      digest_0.6.4
[7] BiocInstaller_1.16.0

loaded via a namespace (and not attached):
 [1] BiocGenerics_0.11.5   evaluate_0.5.5        formatR_1.0
 [4] GenomeInfoDb_1.1.26   GenomicRanges_1.17.48 htmltools_0.2.6
 [7] IRanges_1.99.32       parallel_3.1.1        S4Vectors_0.2.8
[10] stats4_3.1.1          stringr_0.6.2         XVector_0.5.8
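
The sessionInfo above lists GenomicRanges under "loaded via a namespace
(and not attached)".  A minimal sketch of that loaded-versus-attached
distinction follows.  It is not part of the experiment with foo: stats4,
which ships with R, stands in for GenomicRanges so that the code runs
without Bioconductor, and the variable names are made up for illustration.

```r
# Illustration only: stats4 (bundled with R) stands in for GenomicRanges.
pkg <- "stats4"

# Loading a namespace makes its code available but does not attach it.
loadNamespace(pkg)

is_loaded   <- pkg %in% loadedNamespaces()
is_attached <- paste0("package:", pkg) %in% search()

# An exported function can be reached with `::` once the namespace is
# loaded, whether or not the package is attached.
via_namespace <- is.function(stats4::mle)

# A bare name is looked up along the search path, so it is normally found
# only after the package has been attached, e.g. by library(stats4).
via_search_path <- exists("mle", mode = "function")

cat("loaded:", is_loaded, " attached:", is_attached, "\n")
cat("stats4::mle reachable:", via_namespace,
    " bare mle reachable:", via_search_path, "\n")
```

In a fresh session one would expect the namespace to be loaded but not
attached, which is the situation a user of foo is in with respect to
GenomicRanges.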


> And that works only if the GenomicRanges package is attached. Attaching
> GenomicRanges will also attach other packages that GenomicRanges depends
> on where some GRanges accessors might be defined and documented (e.g.
> metadata()).
>
>
>>
>> In some cases you'll decide you want the user to have a full complement of
>> methods for your package to function meaningfully.  For example, I am
>> considering using dplyr idioms to work with data structures in a package,
>> and it seems I should just depend on dplyr rather than pick out and
>> document which things I want to expose.  But that may still be an
>> undesirable design.
>>
>>
>>> package, like
>>>    importClassesFrom("GenomicRanges", "GRanges")
>>>    exportClasses("GRanges")
>>> Surely that is not intended.
>>>
>>> It is important that my package works without being attached to the
>>> search path and I do this by carefully importing what I need, i.e. my
>>> code does not require that my dependencies are attached to the search
>>> path.  But the end user will be hosed without it.
>>>
>>
> Yes s/he will. Fortunately when your package namespace gets loaded by
> another package, then nothing gets attached to the search path, even if
> your package depends (instead of imports) on other packages. So using
> Depends instead of Imports for your own dependencies won't make any
> difference in that respect, which is good.
>
>
>>> My impression is that the NOTE in R CMD check was written by someone who
>>> did not anticipate large-scale use and re-use of classes and methods
>>> across many packages.
>>>
>>
> That's my impression too.
>
> Cheers,
> H.
>
>
>>> Best,
>>> Kasper
>>>
>>>
>>> On Tue, Oct 28, 2014 at 11:14 AM, James W. MacDonald <jmacdon at uw.edu>
>>> wrote:
>>>
>>>> I agree with Vince. It's your job as a package developer to make
>>>> available to your package all the functions necessary for the package to
>>>> work. But I am not sure it is your job to load all the packages that
>>>> your end user might need.
>>>>
>>>> Best,
>>>>
>>>> Jim
>>>>
>>>>
>>>>
>>>> On Tue, Oct 28, 2014 at 11:04 AM, Vincent Carey <
>>>> stvjc at channing.harvard.edu> wrote:
>>>>
>>>>> On Tue, Oct 28, 2014 at 10:19 AM, Kasper Daniel Hansen <
>>>>> kasperdanielhansen at gmail.com> wrote:
>>>>>
>>>>>> What is the current best paradigm for using all the classes in
>>>>>> S4Vectors/GenomeInfoDb/GenomicRanges/IRanges?
>>>>>>
>>>>>> I obviously import methods and classes from the relevant packages.
>>>>>>
>>>>>> But shouldn't I depend on these packages as well?  Since I basically
>>>>>> want the user to have this functionality at the command line? That is
>>>>>> what I do now.
>>>>>>
>>>>>>
>>>>> I've wondered about this as well.  It seems the principle is that the
>>>>> user should take care of attaching additional packages when needed.  It
>>>>> might be appropriate to give a hint in the package startup message, if
>>>>> having some other package attached would typically be of great utility.
>>>>>
>>>>> Given your list above, I would think that depending on GenomicRanges
>>>>> would often be sufficient, and IRanges/S4Vectors would not require
>>>>> dependency assertion.  I would think that GenomeInfoDb should be a
>>>>> voluntary attachment for a specific session.
>>>>>
>>>>> These are just my guesses -- I doubt there will be complete consensus,
>>>>> but I have started to think very critically about using Depends, and I
>>>>> think it is better when its use is minimized.
>>>>>
>>>>>
>>>>>> That of course leads to the R CMD check NOTE on depending on too many
>>>>>> packages.... I guess I should ignore that one.
>>>>>>
>>>>>> Best,
>>>>>> Kasper
>>>>>>
>>>>>>          [[alternative HTML version deleted]]
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioc-devel at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> James W. MacDonald, M.S.
>>>> Biostatistician
>>>> University of Washington
>>>> Environmental and Occupational Health Sciences
>>>> 4225 Roosevelt Way NE, # 100
>>>> Seattle WA 98105-6099
>>>>
>>>>
>>>
>>>
>>
>>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fredhutch.org
>
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
>
