[R] Operating on windows of data

Mon Mar 22 11:48:41 CET 2004

On Mon, Mar 22, 2004 at 01:39:28AM -0500, Gabor Grothendieck wrote:
> You can retain the trick of using subset and still get
> rid of the loop in:
> 
>    http://www.mayin.org/ajayshah/KB/R/EXAMPLES/rollingreg.R
> 
> by using sapply like this (untested):
> 
> dat <- sapply( seq(T-width), function(i) {
>     model <- lm(dlinrchf ~ dlusdchf + dljpychf + dldemchf, A, 
>                 i:(i+width-1))
>     details <- summary.lm(model)
>     tmp <- coefficients(model)
>     c( USD = tmp[2], JPY = tmp[3], DEM = tmp[4], 
>            R2 = details$r.squared, RMSE = details$sigma )
> } )
> dat <- as.data.frame(t(dat))
> attach(dat)

This brings me to a question I've always had about "the R way" of
avoiding loops. Yes, the sapply() approach above works. My question
is: Why is this much better than writing it using loops?

Loops tap into the intuition of millions of people who have grown up
around procedural languages. At least for a person like me, code
involving loops reads effortlessly.

And I don't see how the sapply() would be much faster. Intuitively, we
may think that sapply() results in C code getting executed (in the
R sources), while the for loop incurs interpretation overhead, and
so sapply() is surely faster. But when the body of the for loop
involves a weighty thing like a QR decomposition (for the OLS), that
would seem to dominate the cost - as far as I can tell.
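
One way to check this intuition is to time both forms on the same work.
What follows is only a rough sketch under stated assumptions: the data
frame A is simulated here (the real data is not part of this message),
and the names n, fit_window, loop_version and sapply_version are
hypothetical helpers introduced for the illustration. Both versions call
the same per-window function, so any difference in elapsed time comes
from the looping construct alone.

```r
# Simulated stand-in for the data frame A used in the quoted code
# (assumption: the real series are not available here).
set.seed(1)
n     <- 300   # number of observations (called T in the quoted code)
width <- 50    # window length
A <- data.frame(dlusdchf = rnorm(n), dljpychf = rnorm(n), dldemchf = rnorm(n))
A$dlinrchf <- 0.8 * A$dlusdchf + 0.1 * A$dljpychf + 0.1 * A$dldemchf +
              rnorm(n, sd = 0.1)

# Work done for one window: fit the OLS on rows i..(i+width-1) and
# extract three slopes, R^2 and the residual standard error.
fit_window <- function(i) {
  model   <- lm(dlinrchf ~ dlusdchf + dljpychf + dldemchf, data = A,
                subset = i:(i + width - 1))
  details <- summary(model)
  b       <- unname(coef(model))
  c(USD = b[2], JPY = b[3], DEM = b[4],
    R2 = details$r.squared, RMSE = details$sigma)
}

# Start index of every full window of length `width`.
starts <- seq_len(n - width + 1)

# Explicit loop, filling a preallocated matrix row by row.
loop_version <- function() {
  out <- matrix(NA_real_, nrow = length(starts), ncol = 5)
  for (k in seq_along(starts)) out[k, ] <- fit_window(starts[k])
  out
}

# sapply() returns one column per window, so transpose to match the loop.
sapply_version <- function() unname(t(sapply(starts, fit_window)))

time_loop   <- system.time(loop_res   <- loop_version())
time_sapply <- system.time(sapply_res <- sapply_version())

print(rbind(loop = time_loop, sapply = time_sapply)[, "elapsed"])
print(all.equal(loop_res, sapply_res))   # both forms compute the same numbers
```

Timings depend on the machine and the R version, so none are quoted
here. If the lm() fit really dominates, the two elapsed times should
come out close to each other.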

-- 
Ajay Shah                                                   Consultant
ajayshah at mayin.org                      Department of Economic Affairs
http://www.mayin.org/ajayshah           Ministry of Finance, New Delhi