[Rd] speedbump in library
Michael Lawrence
lawrence.michael at gene.com
Mon Jan 26 14:12:55 CET 2015
An isNamespaceLoaded() function would be a useful thing to have in
general if we are interested in readable code. An efficient
implementation would be just a bonus.
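
To make the suggestion concrete, here is a rough sketch of what such a
wrapper could look like in plain R, assuming it does nothing more than wrap
the !is.null(.getNamespace(pkg)) test quoted below. The names is_ns_loaded()
and are_ns_loaded() are made up for illustration; this is not code from the
R sources, and a real version might well be done in C instead.

```r
# Hypothetical helper: the name is invented for this sketch.
is_ns_loaded <- function(pkg) {
    # Enforce the length-1 restriction that applies to .getNamespace()
    stopifnot(is.character(pkg), length(pkg) == 1L, !is.na(pkg))
    # .getNamespace() returns NULL when the namespace is not registered,
    # and (unlike getNamespace()) never loads the package
    !is.null(.getNamespace(pkg))
}

# Hypothetical vectorised variant, for callers that currently rely on
# 'pkgs %in% loadedNamespaces()' with more than one name
are_ns_loaded <- function(pkgs) {
    vapply(pkgs, is_ns_loaded, logical(1), USE.NAMES = FALSE)
}

is_ns_loaded("stats")
are_ns_loaded(c("stats", "foo"))
```

The scalar version keeps call sites about as readable as the %in% form, and
the vectorised one is only there to show that the length-1 caveat need not
leak into calling code.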
On Mon, Jan 26, 2015 at 3:36 AM, Martin Maechler
<maechler at lynne.stat.math.ethz.ch> wrote:
>>>>>> Winston Chang <winstonchang1 at gmail.com>
>>>>>> on Fri, 23 Jan 2015 10:15:53 -0600 writes:
>
> > I think you can simplify a little by replacing this:
>
> > pkg %in% loadedNamespaces()
> > with this:
> > .getNamespace(pkg)
>
> almost: It would be
>
> !is.null(.getNamespace(pkg))
>
> > Whereas getNamespace(pkg) will load the package if it's not already
> > loaded, calling .getNamespace(pkg) (note the leading dot) won't load
> > the package.
>
> indeed.
> And you, Winston, are right that this new code snippet would be
> an order of magnitude faster :
>
> ##-----------------------------------------------------------------------------
>
> f1 <- function(pkg) pkg %in% loadedNamespaces()
> f2 <- function(pkg) !is.null(.getNamespace(pkg))
>
> require(microbenchmark)
>
> pkg <- "foo"; (mbM <- microbenchmark(r1 <- f1(pkg), r2 <- f2(pkg))); stopifnot(identical(r1,r2)); r1
> ## Unit: microseconds
> ##           expr    min      lq     mean  median      uq    max neval cld
> ##  r1 <- f1(pkg) 38.516 40.9790 42.35037 41.7245 42.4060 82.922   100   b
> ##  r2 <- f2(pkg)  1.331  1.8285  2.13874  2.0855  2.3365  7.252   100  a
> ## [1] FALSE
>
> pkg <- "stats"; (mbM <- microbenchmark(r1 <- f1(pkg), r2 <- f2(pkg))); stopifnot(identical(r1,r2)); r1
> ## Unit: microseconds
> ##           expr    min      lq     mean  median      uq    max neval cld
> ##  r1 <- f1(pkg) 29.955 31.2575 32.27748 31.6035 32.1215 62.428   100   b
> ##  r2 <- f2(pkg)  1.067  1.4315  1.71437  1.6335  1.8460  9.169   100  a
> ## [1] TRUE
> loadNamespace("Matrix")
> ## <environment: namespace:Matrix>
> pkg <- "Matrix"; (mbM <- microbenchmark(r1 <- f1(pkg), r2 <- f2(pkg))); stopifnot(identical(r1,r2)); r1
> ## Unit: microseconds
> ##           expr    min      lq     mean  median      uq    max neval cld
> ##  r1 <- f1(pkg) 32.721 33.5205 35.17450 33.9505 34.6050 65.373   100   b
> ##  r2 <- f2(pkg)  1.010  1.3750  1.93671  1.5615  1.7795 12.128   100  a
> ## [1] TRUE
>
> ##-----------------------------------------------------------------------------
>
> Hence, indeed,
> !is.null(.getNamespace(pkg))
>
> seems equivalent to
> pkg %in% loadedNamespaces()
>
> --- when 'pkg' is of length 1 (!!!)
>
> but is 20 times faster.... and we have
> 11 occurrences of ' <...> %in% loadedNamespaces() '
> in the "base packages" in the R (devel) sources,
> 3 in base, 2 in methods, 3 in stats, 2 in tools, 1 in utils..
>
> On the other hand,
> pkg %in% loadedNamespaces()
>
> is extremely nicely readable code, whereas
> !is.null(.getNamespace(pkg))
> is pretty much the contrary.
> .. and readable code is so much easier to maintain etc,
> such that in many cases, code optimization at the cost of
> code obfuscation is *not* desirable.
>
> Of course we could yet again use a few lines of C and R code to
> provide a new R lowlevel function, say
>
> is.loadedNamespace()
>
> which would be even faster than !is.null(.getNamespace(pkg))
>
> ...
> ...
>
> but do we have *any* evidence that this would noticeably speed up
> any higher-level function such as library() ?
>
>
> Thank you, again, Winston; you've opened an interesting topic!
>
> --
> Martin Maechler, ETH Zurich