[R] efficient code. how to reduce running time?

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Jan 22 17:06:12 CET 2007


On Mon, 22 Jan 2007, Charilaos Skiadas wrote:

> On Jan 21, 2007, at 8:11 PM, John Fox wrote:
>
>> Dear Haris,
>>
>> Using lapply() et al. may produce cleaner code, but it won't
>> necessarily
>> speed up a computation. For example:
>>
>>> X <- data.frame(matrix(rnorm(1000*1000), 1000, 1000))
>>> y <- rnorm(1000)
>>>
>>> mods <- as.list(1:1000)
>>> system.time(for (i in 1:1000) mods[[i]] <- lm(y ~ X[,i]))
>> [1] 40.53  0.05 40.61    NA    NA
>>>
>>> system.time(mods <- lapply(as.list(X), function(x) lm(y ~ x)))
>> [1] 53.29  0.37 53.94    NA    NA
>>
> Interesting, in my system the results are quite different:
>
> > system.time(for (i in 1:1000) mods[[i]] <- lm(y ~ X[,i]))
> [1] 192.035  12.601 797.094   0.000   0.000
> > system.time(mods <- lapply(as.list(X), function(x) lm(y ~ x)))
> [1]  59.913   9.918 289.030   0.000   0.000
>
> Regular MacOSX install with ~760MB memory.

But Mac OS X is infamous for having rather specific speed problems with its 
malloc, and so gives timing results that differ from those on all other platforms.
We are promised a fix in Mac OS X 10.5.

Both of your machines seem very slow compared to mine:

> system.time(for (i in 1:1000) mods[[i]] <- lm(y ~ X[,i]))
    user  system elapsed
  11.011   0.250  11.311
> system.time(mods <- lapply(as.list(X), function(x) lm(y ~ x)))
    user  system elapsed
  13.463   0.260  13.812

and that on a 64-bit platform (AMD64 Linux FC5).
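
As a rough sketch (not timed in this thread): in both versions most of the 
cost is inside lm() itself, which builds a model frame from the formula for 
every column, so the choice of looping construct matters comparatively little.
If only the numerical fit results are needed, calling lm.fit() from the stats 
package on a prebuilt design matrix avoids that overhead and is likely to be 
noticeably faster on any of the platforms above:

    ## Sketch only: lm.fit() skips the formula/model.frame machinery and
    ## returns just coefficients, residuals, effects and the QR decomposition,
    ## not a full "lm" object.
    X <- data.frame(matrix(rnorm(1000*1000), 1000, 1000))
    y <- rnorm(1000)

    system.time(
        fits <- lapply(X, function(x) lm.fit(cbind(Intercept = 1, x), y))
    )

    ## e.g. intercept and slope of the first regression:
    ## fits[[1]]$coefficients

The trade-off is that lm.fit() does no formula handling, NA handling or 
summary methods, so this only makes sense when those are not needed.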

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


