[R] apply vs sapply vs loop - lm() call appl(y)ied on array
Christoph Lehmann
christoph.lehmann at gmx.ch
Thu Apr 21 15:23:31 CEST 2005
Dear useRs
(Code of the now mentioned small example is below)
I have 7 * 8 * 9 = 504 series of data (each length 5). For each of
theses series I want to compute a lm(), where the designmatrx X is the
same for all these computations.
The 504 series are in an array of dimension d.dim <- c(5, 7, 8, 9)
means, the first dimension holds the data-series.
The lm computation needs performance optimization, since in fact the
dimensions are much larger. I compared the following approaches:
using a for-loop. using apply, and using sapply. All of these require
roughly the same time of computation. I was astonished since I expected
at least sapply to outperfomr the for-loop.
Do you have me another solution, which is faster? many thanks
here is the code
## ------------------------------------------------------
t.length <- 5
d.dim <- c(t.length,7,8,9) # dimesions: time, x, y, z
Y <- array( rep(1:t.length, prod(d.dim)) + rnorm(prod(d.dim), 0, 0.1),
d.dim)
X <- c(1,3,2,4,5)
## -------- performance tests
## using for loop
date()
z <- rep(0, prod(d.dim[2:4]))
l <- 0
for (i in 1:dim(Y)[4])
for (j in 1:dim(Y)[3])
for (k in 1:dim(Y)[2]) {
l <- l + 1
z[l] <- unlist(summary(lm(Y[,k, j, i] ~ X)))$r.squared
}
date()
## using apply
date()
z <- apply(Y, 2:4, function(x) unlist(summary(lm(x ~ X)))$r.squared)
date()
## using sapply
date()
fac <- rep(1:prod(d.dim[2:4]), rep(t.length, prod(d.dim[2:4])))
z <- sapply(split(as.vector(Y), fac), FUN = function(x)
unlist(summary(lm(x ~ X)))$r.squared)
dim(z) <- d.dim[2:4]
date()
## ------------------------------------------------------
--
Christoph Lehmann Phone: ++41 31 930 93 83
Department of Psychiatric Neurophysiology Mobile: ++41 76 570 28 00
University Hospital of Clinical Psychiatry Fax: ++41 31 930 99 61
Waldau lehmann at puk.unibe.ch
CH-3000 Bern 60 http://www.puk.unibe.ch/cl/pn_ni_cv_cl_04.html
More information about the R-help
mailing list