[Rd] cache most-recent dispatch
Valerie Obenchain
vobencha at fhcrc.org
Tue Jul 2 07:04:34 CEST 2013
Hi,
S4 method dispatch can be very slow. Would it be reasonable to cache the
most recent dispatch, anticipating the next invocation will be on the same
type? This would be very helpful in loops.
fun0 <- function(x)
    sapply(x, paste, collapse="+")
fun1 <- function(x) {
    paste <- selectMethod(paste, class(x[[1]]))
    sapply(x, paste, collapse="+")
}
lst <- split(rep(LETTERS, 100), rep(1:1300, 2))
library(microbenchmark)
microbenchmark(fun0(lst), times=10)
## Unit: milliseconds
## expr min lq median uq max neval
## fun0(lst) 4.153287 4.180659 4.513539 5.19261 5.280481 10
setGeneric("paste")
microbenchmark(fun0(lst), fun1(lst), times=10)
## Unit: milliseconds
## expr min lq median uq max neval
## fun0(lst) 21.093180 21.27616 21.453174 21.833686 24.758791 10
## fun1(lst) 4.517808 4.53067 4.582641 4.682235 5.121856 10
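The caching idea can be sketched at user level, as a rough illustration only. The helper cacheLastDispatch() and the generic describe() below are hypothetical names made up for this example; they are not part of R or the methods package, and this is not how the dispatch code itself would implement a cache. The sketch assumes dispatch on the class of the first argument only, and it calls the selected method directly, so methods that rely on callNextMethod() may behave differently.

```r
library(methods)

## Hypothetical helper: wrap a generic (given by name) so that the method
## selected for the previous call is reused while the class of the first
## argument stays the same.
cacheLastDispatch <- function(generic) {
    lastClass <- NULL
    lastMethod <- NULL
    function(x, ...) {
        cls <- class(x)
        if (!identical(cls, lastClass)) {
            ## Cache miss: look the method up once and remember it.
            lastMethod <<- selectMethod(generic, cls[[1L]])
            lastClass <<- cls
        }
        lastMethod(x, ...)
    }
}

## Illustrative generic and methods (assumptions for this sketch).
setGeneric("describe", function(x, ...) standardGeneric("describe"))
setMethod("describe", "character", function(x, ...) paste(x, collapse = "+"))
setMethod("describe", "numeric", function(x, ...) sum(x))

cachedDescribe <- cacheLastDispatch("describe")
lst <- split(rep(LETTERS, 100), rep(1:1300, 2))
res <- sapply(lst, cachedDescribe)
```

Here selectMethod() runs only when the class changes between consecutive calls, which is the same effect fun1() gets by hoisting the lookup out of the loop by hand.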
Dispatch seems to be especially slow when packages are involved, e.g.,
with the Bioconductor IRanges package
(http://bioconductor.org/packages/release/bioc/html/IRanges.html):
removeGeneric("paste")
library(IRanges)
showMethods(paste)
## Function: paste (package BiocGenerics)
## ...="ANY"
## ...="Rle"
selectMethod(paste, "ANY")
## Method Definition (Class "derivedDefaultMethod"):
##
## function (..., sep = " ", collapse = NULL)
## .Internal(paste(list(...), sep, collapse))
## <environment: namespace:base>
##
## Signatures:
## ...
## target "ANY"
## defined "ANY"
microbenchmark(fun0(lst), fun1(lst), times=10)
## Unit: milliseconds
## expr min lq median uq max neval
## fun0(lst) 233.539585 234.592491 236.311209 237.268506 243.181123 10
## fun1(lst) 4.564914 4.592996 4.642898 4.729009 5.492706 10
sessionInfo()
## R version 3.0.0 Patched (2013-04-04 r62492)
## Platform: x86_64-unknown-linux-gnu (64-bit)
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=C LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] IRanges_1.19.15 BiocGenerics_0.7.2 microbenchmark_1.3-0
##
## loaded via a namespace (and not attached):
## [1] stats4_3.0.0
Thanks,
Valerie