[R] pre-allocation not always a timesaver
Ross Boylan
ross at biostat.ucsf.edu
Fri Feb 28 04:53:20 CET 2014
The R Inferno advises that if you are building up results in pieces it's
best to pre-allocate the result object and fill it in. In some testing,
I see a benefit with this strategy for regular variables. However, when
the results are held by a class, the opposite seems to be the case.
Comments? Explanations?
Possibly for classes any update causes the entire object to be
replaced--perhaps to trigger the validation machinery?--and so
preallocation simply means on average a bigger object is being
manipulated.
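To make the guess above concrete, here is a minimal sketch of how R evaluates a nested replacement such as `a@myx[, i] <- v`: the slot value is extracted, modified, and then assigned back into the object as a whole. The class name `demoSlot` is hypothetical and exists only for this sketch. The code shows the evaluation order only; it does not show whether, or how often, the object is copied.

```r
library(methods)

# "demoSlot" is a hypothetical class used only in this sketch.
demoSlot <- setClass("demoSlot", representation(myx = "ANY"))

a <- demoSlot(myx = array(0, dim = c(3, 2)))
b <- a
v <- c(1, 2, 3)

# Nested replacement form, as used in the loops of the test code.
a@myx[, 1] <- v

# Roughly what R evaluates for the line above: extract the slot,
# modify the extracted value, then assign the whole slot back.
tmp <- b@myx
tmp[, 1] <- v
b@myx <- tmp

same <- identical(a@myx, b@myx)
```

If the slot assignment is the expensive step, each pass through the loop pays for it with the full pre-allocated matrix, which would fit the timings reported here.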
Here is some test code, with CPU seconds given in the comments. I ran
each test twice in case there was some "first-time" overhead, such as
growing total memory in the image. When the two timings differed
noticeably I reported both values.
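The run-everything-twice procedure could be wrapped in a small helper along these lines. `time_twice` is a hypothetical name, not part of the original test code, and adding user and system time is an assumption about what "CPU seconds" means here.

```r
# Hypothetical helper: time f(...) twice and return CPU seconds
# (user + system) for each run.
time_twice <- function(f, ...) {
  vapply(1:2, function(k) {
    tm <- system.time(f(...))
    tm[["user.self"]] + tm[["sys.self"]]
  }, numeric(1))
}

# Example call with a cheap stand-in function.
timings <- time_twice(function(n) sum(rnorm(n)), 1000)
```

A call such as `time_twice(pre, 1000)` would then return both timings for the first test.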
# class definitions
refbase <- setRefClass("refBase",
    fields = list(dispatch = "ANY", myx = "ANY"),
    methods = list(
        initialize = function(x0 = NULL, ...) {
            # foo is one of the methods edited out below
            usingMethods("foo")
            dispatch <<- foo
            myx <<- x0
        }
        # some irrelevant methods edited out
    ))
myclass <- setClass("simple", representation = list(myx = "ANY"))
# Method 1: regular variables
pre <- function(n, j=1000) {
    x <- array(dim=c(j, n))
    for (i in 1:n) {
        x[,i] <- rnorm(j)
    }
    x
}
system.time(pre(1000)) # 0.3s
nopre <- function(n, j=1000) {
    x <- numeric(0)
    for (i in 1:n)
        x <- c(x, rnorm(j))
    x
}
system.time(nopre(1000)) # 2.0s, 2.7s
# Method 2: with ref class
pre2 <- function(n, j=1000) {
    a <- refbase(x0=numeric(0))
    a$myx <- array(dim=c(j, n))
    for (i in 1:n) {
        a$myx[,i] <- rnorm(j)
    }
    a$myx
}
system.time(pre2(1000)) # 4.0s
nopre2 <- function(n, j=1000) {
    a <- refbase(x0=numeric(0))
    for (i in 1:n)
        a$myx <- c(a$myx, rnorm(j))
    a$myx
}
system.time(nopre2(1000)) # 2.9s, 4.3s
# Method 3: with regular class
pre3 <- function(n, j=1000) {
    a <- myclass()
    a@myx <- array(dim=c(j, n))
    for (i in 1:n) {
        a@myx[,i] <- rnorm(j)
    }
    a@myx
}
system.time(pre3(1000)) # 7.3s
nopre3 <- function(n, j=1000) {
    a <- myclass(myx=numeric(0))
    for (i in 1:n)
        a@myx <- c(a@myx, rnorm(j))
    a@myx
}
system.time(nopre3(1000)) # 4.2s
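A possible variant, not part of the timings above, is to fill an ordinary local matrix and store it in the slot once at the end, so the loop never performs a slot replacement. `pre3_local` and the class name `simpleLocal` are hypothetical names for this sketch, and no timing is claimed for it.

```r
library(methods)

# "simpleLocal" is a hypothetical stand-in for the "simple" class above.
simpleLocal <- setClass("simpleLocal", representation(myx = "ANY"))

# Pre-allocate and fill a plain matrix, then assign it to the slot once.
pre3_local <- function(n, j = 1000) {
  x <- array(dim = c(j, n))
  for (i in seq_len(n)) {
    x[, i] <- rnorm(j)
  }
  a <- simpleLocal(myx = x)
  a@myx
}

res <- pre3_local(50, j = 20)
```

Comparing `system.time(pre3_local(1000))` with the pre3 figure would be one way to see how much of the cost comes from updating the slot inside the loop.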