[R] efficient code. how to reduce running time?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Mon Jan 22 17:06:12 CET 2007
On Mon, 22 Jan 2007, Charilaos Skiadas wrote:
> On Jan 21, 2007, at 8:11 PM, John Fox wrote:
>
>> Dear Haris,
>>
>> Using lapply() et al. may produce cleaner code, but it won't necessarily
>> speed up a computation. For example:
>>
>>> X <- data.frame(matrix(rnorm(1000*1000), 1000, 1000))
>>> y <- rnorm(1000)
>>>
>>> mods <- as.list(1:1000)
>>> system.time(for (i in 1:1000) mods[[i]] <- lm(y ~ X[,i]))
>> [1] 40.53 0.05 40.61 NA NA
>>>
>>> system.time(mods <- lapply(as.list(X), function(x) lm(y ~ x)))
>> [1] 53.29 0.37 53.94 NA NA
>>
> Interesting, on my system the results are quite different:
>
> > system.time(for (i in 1:1000) mods[[i]] <- lm(y ~ X[,i]))
> [1] 192.035 12.601 797.094 0.000 0.000
> > system.time(mods <- lapply(as.list(X), function(x) lm(y ~ x)))
> [1] 59.913 9.918 289.030 0.000 0.000
>
> Regular MacOSX install with ~760MB memory.
But MacOS X is infamous for having rather specific speed problems with its
malloc, and so gives different timing results from all other platforms.
We are promised a solution in MacOS 10.5.
Both of your machines seem very slow compared to mine:
> system.time(for (i in 1:1000) mods[[i]] <- lm(y ~ X[,i]))
user system elapsed
11.011 0.250 11.311
> system.time(mods <- lapply(as.list(X), function(x) lm(y ~ x)))
user system elapsed
13.463 0.260 13.812
and that on a 64-bit platform (AMD64 Linux FC5).
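For the task itself, most of the running time goes into lm()'s formula and
model-frame handling, repeated 1000 times, rather than into the numerical
fitting.  A minimal sketch (using the X and y above, and assuming only the
fitted components are wanted, not full "lm" objects) that avoids that
overhead by calling lm.fit() on an explicit design matrix:

Xm  <- as.matrix(X)          # 1000 x 1000 numeric matrix
one <- rep(1, length(y))     # intercept column
fits <- lapply(seq_len(ncol(Xm)),
               function(i) lm.fit(cbind(one, Xm[, i]), y))
## each element has $coefficients, $residuals, $fitted.values and so on,
## but is not an "lm" object, so summary() and friends will not apply

That is a trade-off rather than a free speed-up: you give up the formula
interface and the methods that go with it.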
--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595