[Rd] efficiency and memory use of S4 data objects
Gordon Smyth
smyth at wehi.edu.au
Thu Aug 21 20:12:37 MEST 2003
I do lots of analyses on large microarray data sets so memory use and speed
are both important issues for me. I have been trying to estimate the
overheads associated with using formal S4 data objects instead of ordinary
lists for large data objects. In some simple experiments (using R 1.7.1 in
Windows 2000) with large but simple objects it seems that giving a data
object a formal class definition and using extractor and assignment
functions may increase both memory usage and the time taken by simple
numeric operations by several fold.
Here is a test function which uses a list representation to add 1 to the
elements of a long numeric vector:
addlist <- function(len,iter) {
  object <- list(x=rnorm(len))
  for (i in 1:iter) object$x <- object$x+1
  object
}
Typical times on my machine are:
> system.time(a <- addlist(10^6,10))
[1] 2.91 0.00 2.96 NA NA
> system.time(addlist(10^7,10))
[1] 28.03 0.44 28.65 NA NA
Here is a test function doing the same operation with a formal S4 data
representation:
addS4 <- function(len,iter) {
  object <- new("MyClass",x=rnorm(len))
  for (i in 1:iter) x(object) <- x(object)+1
  object
}
The timing with len=10^6 increases to
> system.time(a <- addS4(10^6,10))
[1] 6.79 0.06 6.90 NA NA
With len=10^7 the operation fails altogether due to insufficient memory
after thrashing around with virtual memory for a very long time.
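A rough way to put a number on the memory side of this comparison is to read R's own peak-usage statistics around a call. The following is only a sketch, assuming a version of R in which gc() accepts reset = TRUE (not checked against R 1.7.1); peakMb is a hypothetical helper name, not part of R.

```r
# Hypothetical helper: approximate peak memory (in Mb) while evaluating expr.
# Assumes gc() supports reset = TRUE and reports a "max used" column.
peakMb <- function(expr) {
  gc(reset = TRUE)                 # reset the "max used" statistics
  force(expr)                      # evaluate the (lazy) argument now
  g <- gc()                        # collect again and read the statistics
  col <- which(colnames(g) == "max used") + 1  # the (Mb) column beside "max used"
  sum(g[, col])                    # cons cells plus vector heap
}

m <- peakMb(rnorm(10^6) + 1)
print(m)
```

The same helper could be wrapped around addlist(10^6,10) and addS4(10^6,10) to compare the two representations. The figure includes whatever was already allocated in the session, so only differences between runs are meaningful.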
I guess I'm not surprised by the performance penalty with S4. My question
is: is the performance penalty likely to be an ongoing feature of using S4
methods or will it likely go away in future versions of R?
Thanks
Gordon
Here are my S4 definitions:
setClass("MyClass",representation(x="numeric"))
setGeneric("x",function(object) standardGeneric("x"))
setMethod("x","MyClass",function(object) object@x)
setGeneric("x<-", function(object, value) standardGeneric("x<-"))
setReplaceMethod("x","MyClass",function(object,value) {object@x <- value;
return(object)})
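To separate the cost of generic dispatch from the cost of slot assignment itself, the same loop can be written with direct slot access. This is only a sketch and not one of the timings reported above; MyClassDirect and addSlot are hypothetical names introduced for illustration.

```r
library(methods)

# Hypothetical class, named so that it does not clash with MyClass
setClass("MyClassDirect", representation(x = "numeric"))

# Same loop as addS4, but reading and writing the slot directly with @
# instead of going through the x() and x<-() generics
addSlot <- function(len, iter) {
  object <- new("MyClassDirect", x = rnorm(len))
  for (i in seq_len(iter)) object@x <- object@x + 1
  object
}

print(system.time(b <- addSlot(10^6, 10)))
```

If addSlot runs at close to the speed of addlist, the overhead would seem to lie mostly in the extractor and replacement generics rather than in the S4 object itself.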
> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    1
minor    7.1
year     2003
month    06
day      16
language R