[Rd] cache most-recent dispatch

Valerie Obenchain vobencha at fhcrc.org
Tue Jul 2 07:04:34 CEST 2013


Hi,

S4 method dispatch can be very slow. Would it be reasonable to cache the
most recent dispatch, anticipating the next invocation will be on the
same type? This would be very helpful in loops.

   fun0 <- function(x)
       sapply(x, paste, collapse="+")
   fun1 <- function(x) {
       paste <- selectMethod(paste, class(x[[1]]))
       sapply(x, paste, collapse="+")
   }
   lst <- split(rep(LETTERS, 100), rep(1:1300, 2))

   library(microbenchmark)
   microbenchmark(fun0(lst), times=10)
   ## Unit: milliseconds
   ##       expr      min       lq   median      uq      max neval
   ##  fun0(lst) 4.153287 4.180659 4.513539 5.19261 5.280481    10

   setGeneric("paste")
   microbenchmark(fun0(lst), fun1(lst), times=10)
   ## Unit: milliseconds
   ##       expr       min       lq    median        uq       max neval
   ##  fun0(lst) 21.093180 21.27616 21.453174 21.833686 24.758791    10
   ##  fun1(lst)  4.517808  4.53067  4.582641  4.682235  5.121856    10
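
To make the idea concrete, here is a rough R-level sketch of "cache the most
recent dispatch". It only illustrates the behaviour and is not a proposal for
how the methods package should implement it internally. The generic
collapsePlus and the helper cacheLastDispatch are hypothetical names invented
for this example, and the sketch assumes dispatch on the class of the first
argument only.

```r
library(methods)

## Hypothetical generic with two methods, defined only for illustration.
setGeneric("collapsePlus", function(x, ...) standardGeneric("collapsePlus"))
setMethod("collapsePlus", "character",
          function(x, ...) paste(x, collapse = "+"))
setMethod("collapsePlus", "ANY",
          function(x, ...) paste(as.character(x), collapse = "+"))

## Hypothetical helper: returns a wrapper that remembers the method selected
## for the class of the most recent argument and reuses it while the class
## stays the same.
cacheLastDispatch <- function(generic) {
    lastClass <- NULL
    lastMethod <- NULL
    function(x, ...) {
        cls <- class(x)[[1L]]
        if (!identical(cls, lastClass)) {
            ## Cache miss: do the full method lookup once and remember it.
            lastMethod <<- selectMethod(generic, cls)
            lastClass <<- cls
        }
        ## Cache hit (or freshly filled cache): call the method directly.
        lastMethod(x, ...)
    }
}

cached <- cacheLastDispatch("collapsePlus")
lst <- split(rep(LETTERS, 100), rep(1:1300, 2))
res <- sapply(lst, cached)
```

A real implementation would also have to invalidate the cached method when
methods are added or removed, and would need to handle multi-argument
signatures. This sketch ignores both.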

Dispatch seems to be especially slow when the generic and its methods come
from packages, e.g., with the Bioconductor IRanges package
(http://bioconductor.org/packages/release/bioc/html/IRanges.html):

   removeGeneric("paste")
   library(IRanges)
   showMethods(paste)
   ## Function: paste (package BiocGenerics)
   ## ...="ANY"
   ## ...="Rle"
   selectMethod(paste, "ANY")
   ## Method Definition (Class "derivedDefaultMethod"):
   ##
   ## function (..., sep = " ", collapse = NULL)
   ## .Internal(paste(list(...), sep, collapse))
   ## <environment: namespace:base>
   ##
   ## Signatures:
   ##         ...
   ## target  "ANY"
   ## defined "ANY"

   microbenchmark(fun0(lst), fun1(lst), times=10)
   ## Unit: milliseconds
   ##       expr        min         lq     median         uq        max neval
   ##  fun0(lst) 233.539585 234.592491 236.311209 237.268506 243.181123    10
   ##  fun1(lst)   4.564914   4.592996   4.642898   4.729009   5.492706    10

   sessionInfo()
   ## R version 3.0.0 Patched (2013-04-04 r62492)
   ## Platform: x86_64-unknown-linux-gnu (64-bit)
   ##
   ## locale:
   ##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
   ##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
   ##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
   ##  [7] LC_PAPER=C                 LC_NAME=C
   ##  [9] LC_ADDRESS=C               LC_TELEPHONE=C
   ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
   ##
   ## attached base packages:
   ## [1] parallel  stats     graphics  grDevices utils     datasets  methods
   ## [8] base
   ##
   ## other attached packages:
   ## [1] IRanges_1.19.15      BiocGenerics_0.7.2   microbenchmark_1.3-0
   ##
   ## loaded via a namespace (and not attached):
   ## [1] stats4_3.0.0


Thanks,
Valerie


