[Bioc-devel] Confusing namespace issue with IRanges 1.99.17
Hervé Pagès
hpages at fhcrc.org
Tue Jul 8 17:15:25 CEST 2014
Hi guys,
On 07/08/2014 05:29 AM, Michael Lawrence wrote:
> This is why I tell people not to use require(). But what's with needing to
> load IRanges to subset an Rle? Is that temporary?
Very temporary. The source code of the "extractROWS" and "replaceROWS"
methods for Rle objects actually contains the following comment:
    ## FIXME: Right now, the subscript 'i' is turned into an IRanges
    ## object so we need stuff that lives in the IRanges package for this
    ## to work. This is ugly/hacky and needs to be fixed (thru a redesign
    ## of this method).
    if (!suppressWarnings(require(IRanges, quietly=TRUE)))
        stop(...)
    ...
I introduced this hack last week when I moved the Rle code from IRanges
to S4Vectors. It's temporary. The two methods need to be refactored,
which I'm planning to do this week.
Cheers,
H.
>
> Limiting imports is unlikely to reduce loading time. It may actually
> increase it. There are good reasons for it though.
>
>
>
> On Tue, Jul 8, 2014 at 5:21 AM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>
>> Hi Leonardo --
>>
>>
>> On 07/07/2014 03:27 PM, Leonardo Collado Torres wrote:
>>
>>> Hello BioC-devel list,
>>>
>>> I am currently confused by a namespace issue that I haven't been able
>>> to solve. To reproduce it, I made the simplest example I could think of.
>>>
>>>
>>> Step 1: make some toy data and save it on your desktop
>>>
>>> library(IRanges)
>>> DF <- DataFrame(x = Rle(0, 10), y = Rle(1, 10))
>>> save(DF, file="~/Desktop/DF.Rdata")
>>>
>>> Step 2: install the toy package on R 3.1.x
>>>
>>> library(devtools)
>>> install_github("lcolladotor/fooPkg")
>>> # Note that it passes R CMD check
>>>
>>> Step 3: on a new R session run
>>>
>>> example("foo", "fooPkg")
>>> # Change the location of DF.Rdata if necessary
>>>
>>>
>>> You will see that when running the example, the session information is
>>> printed listing:
>>>
>>> other attached packages:
>>> [1] fooPkg_0.0.1
>>>
>>> loaded via a namespace (and not attached):
>>> [1] BiocGenerics_0.11.3 IRanges_1.99.17 parallel_3.1.0
>>> S4Vectors_0.1.0 stats4_3.1.0 tools_3.1.0
>>>
>>>
>>> Then the message for loading IRanges is shown, which is something I
>>> was not expecting, and the following session info then shows:
>>>
>>> other attached packages:
>>> [1] IRanges_1.99.17 S4Vectors_0.1.0 BiocGenerics_0.11.3
>>> fooPkg_0.0.1
>>>
>>> loaded via a namespace (and not attached):
>>> [1] stats4_3.1.0 tools_3.1.0
>>>
>>> Meaning that IRanges, S4Vectors and BiocGenerics all went from "loaded
>>> via a namespace" to "other attached packages".
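
For readers following along, the loaded/attached distinction described
above can be checked directly. This is only a minimal sketch: it uses the
'tools' package (which ships with R) as a stand-in for IRanges, and the
variable names are purely illustrative.

```r
# "Loaded" means the namespace is in memory; "attached" means the package
# is also on the search path. 'tools' is only a stand-in for IRanges here.
pkg <- "tools"
attached_before <- paste0("package:", pkg) %in% search()

# loadNamespace() loads the namespace without attaching the package.
invisible(loadNamespace(pkg))
loaded_only <- pkg %in% loadedNamespaces()
attached_after_load <- paste0("package:", pkg) %in% search()

# require() (like library()) loads *and* attaches the package.
ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))
attached_after_require <- paste0("package:", pkg) %in% search()
```

After the loadNamespace() call the search path is unchanged; only the
require() call puts the package on it, which is the move from "loaded via
a namespace" to "other attached packages" in sessionInfo().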
>>>
>>>
>>>
>>> All the fooPkg::foo() is doing is using a mapply() to go through a
>>> DataFrame and a list of indices to subset the data as shown at
>>> https://github.com/lcolladotor/fooPkg/blob/master/R/foo.R#L26 That is:
>>>
>>> res <- mapply(function(x, y) { x[y] }, DF, index)
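
To make the shape of that call concrete, here is a rough base R sketch.
It assumes a plain data.frame in place of the DataFrame of Rle columns,
and the 'index' list is hypothetical (the real one lives in fooPkg), so
it illustrates the pattern only, not the S4Vectors code path.

```r
# Base R stand-ins: a data.frame instead of a DataFrame of Rle columns.
DF <- data.frame(x = rep(0, 10), y = rep(1, 10))

# Hypothetical indices, one vector per column of DF.
index <- list(1:3, c(2, 5))

# mapply() walks the columns of DF and the elements of 'index' in parallel
# and subsets each column with its own index vector. SIMPLIFY = FALSE keeps
# the result as a list even when the pieces happen to have equal lengths.
res <- mapply(function(x, y) x[y], DF, index, SIMPLIFY = FALSE)
str(res)
```

With Rle columns, the x[y] step is where the '[' method (and, through it,
extractROWS) gets dispatched.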
>>>
>>> I thus thought that the only thing I would need to specify on the
>>> namespace is to import the '[' IRanges method.
>>>
>>> Checking with BiocCheck and codetoolsBioC suggests importing the
>>> method for mapply() from BiocGenerics. Doing so doesn't affect things
>>> and R still loads IRanges on that mapply() call. Importing the '['
>>> method from S4Vectors doesn't help either. Most intriguing, importing
>>> the whole S4Vectors, BiocGenerics and IRanges still doesn't change the
>>> fact that IRanges is loaded when evaluating the same line of code
>>> shown above.
>>>
>>> Any clues on what I am missing or doing wrong?
>>>
>>>
>> This comes from S4Vectors::extractROWS
>>
>>> selectMethod(extractROWS, c("Rle", "integer"))
>> Method Definition:
>>
>> function (x, i)
>> {
>> if (!suppressWarnings(require(IRanges, quietly = TRUE)))
>> stop("Couldn't load the IRanges package. You need to install ",
>> "the IRanges\n package in order to subset an Rle object.")
>>
>> ...
>>
>> which moves the IRanges package from loaded to attached. Maybe that should
>> be 'suppressPackageStartupMessages' or if (!"IRanges" %in%
>> loadedNamespaces()) and functions referenced by IRanges:::...
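
One way the second suggestion could look is sketched below. This is not
the S4Vectors code: call_without_attaching() is a hypothetical helper, and
'tools' with its exported file_ext() stands in for IRanges so that the
snippet runs without any Bioconductor package installed.

```r
# Hypothetical helper: 'pkg' and 'fun' are illustrative parameter names.
call_without_attaching <- function(pkg, fun, ...) {
    # requireNamespace() loads the namespace if needed but never attaches it.
    if (!requireNamespace(pkg, quietly = TRUE))
        stop("Couldn't load the ", pkg, " package.")
    # getExportedValue() is the programmatic equivalent of pkg::fun.
    f <- getExportedValue(pkg, fun)
    f(...)
}

search_before <- search()
ext <- call_without_attaching("tools", "file_ext", "DF.Rdata")
search_after <- search()
```

The search path is the same before and after the call, so nothing gets
attached as a side effect. Non-exported functions would need ':::' (or
getFromNamespace()) instead of the exported-value lookup used here.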
>>
>>
>>
>>
>>
>>>
>>> In my use case, I'm trying to keep the namespace as small as possible
>>> (to minimize loading time) because it's for a tiny package that has a
>>> single function. This tiny package is then loaded on a
>>> BiocParallel::bplapply() call using BiocParallel::SnowParam() which
>>> performs much better than BiocParallel::MulticoreParam() in terms of
>>> keeping the memory under control.
>>>
>>
>> probably it is not desirable to move packages from loaded to attached, but
>> I don't think this influences performance in a meaningful way?
>>
>> Martin
>>
>>
>>
>>>
>>>
>>>
>>> Thank you for your help!
>>> Leo
>>>
>>> Leonardo Collado Torres, PhD student
>>> Department of Biostatistics
>>> Johns Hopkins University
>>> Bloomberg School of Public Health
>>> Website: http://www.biostat.jhsph.edu/~lcollado/
>>> Blog: http://lcolladotor.github.io/
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Full output from running the example:
>>>
>>>
>>>
>>>
>>> example("foo", "fooPkg")
>>>>
>>>
>>> foo> ## Initial info
>>> foo> sessionInfo()
>>> R version 3.1.0 (2014-04-10)
>>> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>>>
>>> locale:
>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] fooPkg_0.0.1
>>>
>>> loaded via a namespace (and not attached):
>>> [1] BiocGenerics_0.11.3 IRanges_1.99.17 parallel_3.1.0
>>> S4Vectors_0.1.0 stats4_3.1.0 tools_3.1.0
>>>
>>> foo> ## Load data
>>> foo> load("~/Desktop/DF.Rdata")
>>>
>>> foo> ## Run function
>>> foo> result <- foo(DF)
>>> R version 3.1.0 (2014-04-10)
>>> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>>>
>>> locale:
>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] fooPkg_0.0.1
>>>
>>> loaded via a namespace (and not attached):
>>> [1] BiocGenerics_0.11.3 IRanges_1.99.17 parallel_3.1.0
>>> S4Vectors_0.1.0 stats4_3.1.0 tools_3.1.0
>>> Loading required package: parallel
>>>
>>> Attaching package: ‘BiocGenerics’
>>>
>>> The following objects are masked from ‘package:parallel’:
>>>
>>> clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
>>> clusterExport, clusterMap, parApply, parCapply, parLapply,
>>> parLapplyLB, parRapply, parSapply, parSapplyLB
>>>
>>> The following object is masked from ‘package:stats’:
>>>
>>> xtabs
>>>
>>> The following objects are masked from ‘package:base’:
>>>
>>> anyDuplicated, append, as.data.frame, as.vector, cbind, colnames,
>>> do.call, duplicated, eval, evalq, Filter, Find, get,
>>> intersect, is.unsorted, lapply, Map, mapply, match, mget, order,
>>> paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
>>> rbind, Reduce, rep.int, rownames, sapply, setdiff, sort, table,
>>> tapply, union, unique, unlist
>>>
>>> R version 3.1.0 (2014-04-10)
>>> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>>>
>>> locale:
>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>>
>>> attached base packages:
>>> [1] parallel stats graphics grDevices utils datasets
>>> methods base
>>>
>>> other attached packages:
>>> [1] IRanges_1.99.17 S4Vectors_0.1.0 BiocGenerics_0.11.3
>>> fooPkg_0.0.1
>>>
>>> loaded via a namespace (and not attached):
>>> [1] stats4_3.1.0 tools_3.1.0
>>>
>>>>
>>>>
>>>
>>>
>>> The same thing happens with the following setup:
>>>
>>> R version 3.1.1 RC (2014-07-07 r66083)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>
>>> locale:
>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] parallel stats graphics grDevices datasets utils methods
>>> [8] base
>>>
>>> other attached packages:
>>> [1] IRanges_1.99.17 S4Vectors_0.1.0 BiocGenerics_0.11.3
>>> [4] fooPkg_0.0.1 colorout_1.0-2
>>>
>>> loaded via a namespace (and not attached):
>>> [1] stats4_3.1.1 tools_3.1.1
>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
>>>
>>
>> --
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>>
>>
>>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319