[Rd] reducing redundant work in methods package

Peter Haverty haverty.peter at gene.com
Wed Jan 21 19:41:52 CET 2015


Hi all,

The function call series genericForPrimitive -> .findBasicFuns -> .findAll
happens 4400 times while the GenomicRanges package is loading.  Each time
.findAll follows a chain of environments to determine that the methods
namespace is the only one that holds a variable called .BasicFunsList. This
accounts for ~10% of package loading time. I'm sure there is some history
to that design, but would it be possible shortcut this operation? Could
.BasicFunsList be initialized in the methods namespace at startup and might
genericForPrimitive just go straight there?

Does anyone on the list know why it works this way?

There are some other cases of seemingly redundant work, but this seems like
an easy one to address.

I have included some code below that was used to investigate some of the
above.

# Try this to count calls to a function

.count <-  0; trace(methods:::.findBasicFuns,tracer=function() { .count <<-
.count + 1 }); library(GenomicRanges); print(.count)

# Try this to capture the input and output of a set of functions you wish
to refactor

.init_test_data_collection <- function(ns = asNamespace("methods")) {

    funs = c("isClassUnion", "getClass", "genericForPrimitive",
"possibleExtends", ".dataSlot", ".requirePackage", ".classEnv",
"getClassDef", "outerLabels", ".getClassFromCache", "getFunction")

    message(paste0("\nCollecting data for unit tests on ", paste(funs,
collapse=", "), " ...\n"))

    # Make env with list to hold test input/output

    TEST_ENV <- new.env()

    for (fname in funs) {

        # Make placeholder for input/output for future runs of this function

        TEST_ENV[[fname]] = list()  # Actually probably not necessary, will
just be c(NULL, list(first result)) the first time

        # Construct test version of function

        unlockBinding(fname, ns)

        fun = get(fname, envir=ns, mode="function")

        funbody = deparse(body(fun))

        newfun <- fun

        newfun.body = c(

            sprintf("fname = '%s'", fname),

            "TEST_INFO = list()",

            "TEST_INFO$input = mget(names(formals(fname)))",

            c("realfun <- function()", funbody),

            "TEST_INFO$output = realfun()",

            "TEST_ENV[[fname]] = c(TEST_ENV[[fname]], list(TEST_INFO))",

            "return(TEST_INFO$output)")

        body(newfun) = as.call(c(as.name("{"),
as.list(parse(text=newfun.body))))

        assign(fname, newfun, envir=ns)

    }

    return(TEST_ENV)

}
# run code, print items in TEST_ENV

The relevant code is in methods/R/BasicFunsList.R and
methods/R/ClassExtensions.R
Pete

____________________
Peter M. Haverty, Ph.D.
Genentech, Inc.
phaverty at gene.com

	[[alternative HTML version deleted]]



More information about the R-devel mailing list