[Rd] speedbump in library

Michael Lawrence lawrence.michael at gene.com
Mon Jan 26 15:11:50 CET 2015


isLoadedNamespace() sounds fine to me.

Thanks for addressing this,
Michael

On Mon, Jan 26, 2015 at 5:51 AM, Martin Maechler
<maechler at lynne.stat.math.ethz.ch> wrote:

> >>>>> Michael Lawrence <lawrence.michael at gene.com>
> >>>>>     on Mon, 26 Jan 2015 05:12:55 -0800 writes:
>
>     > An isNamespaceLoaded() function would be a useful thing to
>     > have in general if we are interested in readable code. An
>     > efficient implementation would be just a bonus.
>
> Good point (readability), and thank you for the support!
>
> Note one slight drawback with your name (which is clearly
> better than my first proposal):
> We'd have the three functions named
>
>      isBaseNamespace(ns)
>      isNamespace(ns)
>      isNamespaceLoaded(name)
>
> where  'name' is really different from 'ns', namely :
>
>   > isNamespace("stats")
>   [1] FALSE
>   > isNamespace(asNamespace("stats"))
>   [1] TRUE
>
> but
>
>   > isNamespaceLoaded("stats")
>   [1] TRUE
>
>   > isNamespaceLoaded(asNamespace("stats"))
>   Error in as.vector(x, "symbol") :
>     cannot coerce type 'environment' to vector of type 'symbol'
>   >
>
> So, from my (non-native English) view, a slightly more suggestive
> function name may be
>
>    isLoadedNamespace(name)
>
> or using Luke's original language, still present in the C code,
>
>    isRegisteredNamespace(name)
>
> but I would prefer the former, isLoadedNamespace()
>
> Martin
>
>     > On Mon, Jan 26, 2015 at 3:36 AM, Martin Maechler
>     > <maechler at lynne.stat.math.ethz.ch> wrote:
>     >>>>>>> Winston Chang <winstonchang1 at gmail.com>
>     >>>>>>>     on Fri, 23 Jan 2015 10:15:53 -0600 writes:
>     >>
>     >> > I think you can simplify a little by replacing this:
>     >> >   pkg %in% loadedNamespaces()
>     >> > with this:
>     >> >   .getNamespace(pkg)
>     >>
>     >> almost: It would be
>     >>
>     >> !is.null(.getNamespace(pkg))
>     >>
>     >> > Whereas getNamespace(pkg) will load the package if it's
>     >> > not already loaded, calling .getNamespace(pkg) (note
>     >> > the leading dot) won't load the package.
>     >>
>     >> indeed.  And you, Winston, are right that this new code
>     >> snippet would be an order of magnitude faster :
>     >>
>     >>
>     >> ##---------------------------------------------------------------------
>     >>
>     >> f1 <- function(pkg) pkg %in% loadedNamespaces()
>     >> f2 <- function(pkg) !is.null(.getNamespace(pkg))
>     >>
>     >> require(microbenchmark)
>     >>
>     >> pkg <- "foo"
>     >> (mbM <- microbenchmark(r1 <- f1(pkg), r2 <- f2(pkg)))
>     >> stopifnot(identical(r1,r2)); r1
>     >> ## Unit: microseconds
>     >> ##           expr    min      lq     mean  median      uq    max neval cld
>     >> ##  r1 <- f1(pkg) 38.516 40.9790 42.35037 41.7245 42.4060 82.922   100   b
>     >> ##  r2 <- f2(pkg)  1.331  1.8285  2.13874  2.0855  2.3365  7.252   100  a
>     >> ## [1] FALSE
>     >>
>     >> pkg <- "stats"
>     >> (mbM <- microbenchmark(r1 <- f1(pkg), r2 <- f2(pkg)))
>     >> stopifnot(identical(r1,r2)); r1
>     >> ## Unit: microseconds
>     >> ##           expr    min      lq     mean  median      uq    max neval cld
>     >> ##  r1 <- f1(pkg) 29.955 31.2575 32.27748 31.6035 32.1215 62.428   100   b
>     >> ##  r2 <- f2(pkg)  1.067  1.4315  1.71437  1.6335  1.8460  9.169   100  a
>     >> ## [1] TRUE
>     >>
>     >> loadNamespace("Matrix")
>     >> ## <environment: namespace:Matrix>
>     >> pkg <- "Matrix"
>     >> (mbM <- microbenchmark(r1 <- f1(pkg), r2 <- f2(pkg)))
>     >> stopifnot(identical(r1,r2)); r1
>     >> ## Unit: microseconds
>     >> ##           expr    min      lq     mean  median      uq    max neval cld
>     >> ##  r1 <- f1(pkg) 32.721 33.5205 35.17450 33.9505 34.6050 65.373   100   b
>     >> ##  r2 <- f2(pkg)  1.010  1.3750  1.93671  1.5615  1.7795 12.128   100  a
>     >> ## [1] TRUE
>     >>
>     >> ##---------------------------------------------------------------------
>     >>
>     >> Hence, indeed, !is.null(.getNamespace(pkg))
>     >>
>     >> seems equivalent to pkg %in% loadedNamespaces()
>     >>
>     >> --- when 'pkg' is of length 1 (!!!)
>     >>
>     >> but is about 20 times faster ... and we have 11 occurrences
>     >> of ' <...> %in% loadedNamespaces() ' in the "base packages"
>     >> in the R (devel) sources: 3 in base, 2 in methods, 3 in
>     >> stats, 2 in tools, 1 in utils.
>     >>
>     >> On the other hand, pkg %in% loadedNamespaces()
>     >>
>     >> is very readable code, whereas
>     >> !is.null(.getNamespace(pkg)) is pretty much the contrary
>     >> ... and readable code is so much easier to maintain, etc.,
>     >> that in many cases code optimization at the cost of code
>     >> obfuscation is *not* desirable.
>     >>
>     >> Of course we could yet again use a few lines of C and R
>     >> code to provide a new R low-level function, say
>     >>
>     >> is.loadedNamespace()
>     >>
>     >> which would be even faster than
>     >> !is.null(.getNamespace(pkg))
>     >>
>     >> ...  ...
>     >>
>     >> but do we have *any* evidence that this would noticeably
>     >> speed up any higher-level function such as library()?
>     >>
>     >>
>     >> Thank you, again, Winston; you've opened an interesting
>     >> topic!
>     >>
>     >> --
>     >> Martin Maechler, ETH Zurich
>     >>
>     >> ______________________________________________
>     >> R-devel at r-project.org mailing list
>     >> https://stat.ethz.ch/mailman/listinfo/r-devel
>
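
For concreteness, here is a minimal R-level sketch of the helper discussed
above. It only wraps the !is.null(.getNamespace(pkg)) idiom from the
benchmark; the name isLoadedNamespace() is the proposal from this thread and
is treated as hypothetical here, and the argument check is my own assumption,
added because the idiom only matches pkg %in% loadedNamespaces() when 'pkg'
has length 1. The C-level version Martin mentions is not shown.

```r
## Hypothetical helper, named after the proposal in this thread.
## It is a plain R wrapper, not the C-level implementation.
isLoadedNamespace <- function(name) {
    ## Assumption: require exactly one non-NA package name, because
    ## .getNamespace() is not vectorized the way %in% is.
    stopifnot(is.character(name), length(name) == 1L, !is.na(name))
    ## .getNamespace() returns NULL for an unregistered namespace
    ## and never loads the package.
    !is.null(.getNamespace(name))
}

isLoadedNamespace("stats")  # TRUE in a default session, where stats is loaded
isLoadedNamespace("foo")    # FALSE, assuming no package "foo" is loaded
```

With the explicit check, passing a namespace environment such as
asNamespace("stats") fails in stopifnot() with a message about the argument,
rather than with the coercion error shown in Martin's example.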
