[Rd] speedbump in library

Martin Maechler maechler at lynne.stat.math.ethz.ch
Mon Jan 26 14:51:39 CET 2015


>>>>> Michael Lawrence <lawrence.michael at gene.com>
>>>>>     on Mon, 26 Jan 2015 05:12:55 -0800 writes:

    > A isNamespaceLoaded() function would be a useful thing to
    > have in general if we are interested in readable code. An
    > efficient implementation would be just a bonus.

Good point (readability), and thank you for the support!

Note one slight drawback with your name (which is clearly
better than my first proposal):
We'd have the three functions named

     isBaseNamespace(ns)
     isNamespace(ns)
     isNamespaceLoaded(name)

where  'name' is really different from 'ns', namely :

  > isNamespace("stats")
  [1] FALSE
  > isNamespace(asNamespace("stats"))
  [1] TRUE

but

  > isNamespaceLoaded("stats")
  [1] TRUE

  > isNamespaceLoaded(asNamespace("stats"))
  Error in as.vector(x, "symbol") : 
    cannot coerce type 'environment' to vector of type 'symbol'
  > 

So, from my (non-native English) point of view, a slightly more
suggestive function name may be

   isLoadedNamespace(name)

or using Luke's original language, still present in the C code,

   isRegisteredNamespace(name)

but I would prefer the former,  isLoadedN..S..()
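
To make the 'name' versus 'ns' distinction concrete, here is a minimal
sketch of such a name-based predicate, written only in terms of the
existing .getNamespace().  The function body and its argument check are
just an illustration under that assumption, not what would end up in
base (a real version would presumably use a few lines of C):

```r
## Sketch only: a name-based predicate built on .getNamespace().
## isLoadedNamespace() is the name proposed in this thread,
## not an existing base function.
isLoadedNamespace <- function(name) {
    ## 'name' is a package name (a string), *not* a namespace environment
    stopifnot(is.character(name), length(name) == 1L)
    !is.null(.getNamespace(name))
}

ns <- asNamespace("stats")   # loads the namespace if needed, returns the environment
isNamespace(ns)              # this predicate takes the environment ...
isLoadedNamespace("stats")   # ... whereas this one takes the name

## Passing the environment is rejected by the explicit check above
res <- tryCatch(isLoadedNamespace(ns), error = function(e) "error")
```

The explicit stopifnot() is there only to give a clearer failure than
the coercion error shown above when an environment is passed by mistake.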

Martin

    > On Mon, Jan 26, 2015 at 3:36 AM, Martin Maechler
    > <maechler at lynne.stat.math.ethz.ch> wrote:
    >>>>>>> Winston Chang <winstonchang1 at gmail.com>
    >>>>>>>     on Fri, 23 Jan 2015 10:15:53 -0600 writes:
    >> 
    >> > I think you can simplify a little by replacing this:
    >> 
    >> >   pkg %in% loadedNamespaces()
    >> > with this:
    >> >   .getNamespace(pkg)
    >> 
    >> almost: It would be
    >> 
    >> !is.null(.getNamespace(pkg))
    >> 
    >> > Whereas getNamespace(pkg) will load the package if it's not
    >> > already loaded, calling .getNamespace(pkg) (note the leading
    >> > dot) won't load the package.
    >> 
    >> indeed.  And you, Winston, are right that this new code
    >> snippet would be an order of magnitude faster :
    >> 
    >> ##-----------------------------------------------------------------------------
    >> 
    >> f1 <- function(pkg) pkg %in% loadedNamespaces()
    >> f2 <- function(pkg) !is.null(.getNamespace(pkg))
    >> 
    >> require(microbenchmark)
    >> 
    >> pkg <- "foo"; (mbM <- microbenchmark(r1 <- f1(pkg), r2 <- f2(pkg)))
    >> stopifnot(identical(r1,r2)); r1
    >> ## Unit: microseconds
    >> ##           expr    min      lq     mean  median      uq    max neval cld
    >> ##  r1 <- f1(pkg) 38.516 40.9790 42.35037 41.7245 42.4060 82.922   100   b
    >> ##  r2 <- f2(pkg)  1.331  1.8285  2.13874  2.0855  2.3365  7.252   100  a
    >> ## [1] FALSE
    >> 
    >> pkg <- "stats"; (mbM <- microbenchmark(r1 <- f1(pkg), r2 <- f2(pkg)))
    >> stopifnot(identical(r1,r2)); r1
    >> ## Unit: microseconds
    >> ##           expr    min      lq     mean  median      uq    max neval cld
    >> ##  r1 <- f1(pkg) 29.955 31.2575 32.27748 31.6035 32.1215 62.428   100   b
    >> ##  r2 <- f2(pkg)  1.067  1.4315  1.71437  1.6335  1.8460  9.169   100  a
    >> ## [1] TRUE
    >> 
    >> loadNamespace("Matrix")
    >> ## <environment: namespace:Matrix>
    >> pkg <- "Matrix"; (mbM <- microbenchmark(r1 <- f1(pkg), r2 <- f2(pkg)))
    >> stopifnot(identical(r1,r2)); r1
    >> ## Unit: microseconds
    >> ##           expr    min      lq     mean  median      uq    max neval cld
    >> ##  r1 <- f1(pkg) 32.721 33.5205 35.17450 33.9505 34.6050 65.373   100   b
    >> ##  r2 <- f2(pkg)  1.010  1.3750  1.93671  1.5615  1.7795 12.128   100  a
    >> ## [1] TRUE
    >> 
    >> ##-----------------------------------------------------------------------------
    >> 
    >> Hence, indeed, !is.null(.getNamespace(pkg))
    >> 
    >> seems equivalent to pkg %in% loadedNamespaces()
    >> 
    >> --- when 'pkg' is of length 1 (!!!)
    >> 
    >> but is 20 times faster....  and we have 11 occurrences of
    >> ' <...> %in% loadedNamespaces() ' in the "base packages"
    >> in the R (devel) sources, 3 in base, 2 in methods, 3 in
    >> stats, 2 in tools, 1 in utils..
    >> 
    >> On the other hand, pkg %in% loadedNamespaces()
    >> 
    >> is extremely nicely readable code, whereas
    >> !is.null(.getNamespace(pkg)) is pretty much the contrary.
    >> .. and readable code is so much easier to maintain
    >> etc., that in many cases code optimization at the
    >> cost of code obfuscation is *not* desirable.
    >> 
    >> Of course we could yet again use a few lines of C and R
    >> code to provide a new R lowlevel function, say
    >> 
    >> is.loadedNamespace()
    >> 
    >> which would be even faster than
    >> !is.null(.getNamespace(pkg))
    >> 
    >> ...  ...
    >> 
    >> but do we have *any* evidence that this would noticeably
    >> speed up any higher level function such as library() ?
    >> 
    >> 
    >> Thank you, again, Winston; you've opened an interesting
    >> topic!
    >> 
    >> --
    >> Martin Maechler, ETH Zurich
    >> 
    >> ______________________________________________
    >> R-devel at r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel
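
On the length-1 caveat in the quoted message: pkg %in% loadedNamespaces()
is vectorized over 'pkg', whereas the .getNamespace() test is meant for
a single name.  A minimal sketch of keeping the vectorized behaviour
while still using .getNamespace(); the helper name areLoaded() is made
up for this illustration:

```r
## Sketch only; areLoaded() is a hypothetical helper name.
## Applies the single-name test to each element of 'pkgs'.
areLoaded <- function(pkgs)
    vapply(pkgs, function(p) !is.null(.getNamespace(p)),
           logical(1), USE.NAMES = FALSE)

loadNamespace("stats")                      # make sure one known namespace is loaded
pkgs <- c("stats", "noSuchPkgForThisSketch")

r1 <- pkgs %in% loadedNamespaces()          # vectorized, readable
r2 <- areLoaded(pkgs)                       # element-wise .getNamespace() test
stopifnot(identical(r1, r2))
```

Whether the per-element loop keeps any of the speed advantage for longer
vectors is not something the timings above say anything about.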


