[Rd] Speed of for loops

Oleg Sklyar osklyar at ebi.ac.uk
Wed Jan 31 00:42:27 CET 2007


It is surely an elegant way of doing things (although far from being 
easy to parse visually) but is it really faster than a loop?

After all, the indexing problem is the same and sapply simply does the 
same job as for in this case, plus "<<-" will _search_ through the 
environment on every single step. Where is the gain?

Oleg

--
Dr Oleg Sklyar | EBI-EMBL, Cambridge CB10 1SD, UK | +44-1223-494466


Byron Ellis wrote:
> Actually, why not use a closure to store previous value(s)?
> 
> In the simple case, which depends on x_i and y_{i-1}
> 
> gen.iter = function(x) {
>     y = NA
>     function(i) {
>        y <<- if(is.na(y)) x[i] else y+x[i]
>     }
> }
> 
> y = sapply(1:10,gen.iter(x))
> 
> Obviously you can modify the function for the bookkeeping required to
> manage whatever lag you need. I use this sometimes when I'm
> implementing MCMC samplers of various kinds.
> 
> 
> On 1/30/07, Herve Pages <hpages at fhcrc.org> wrote:
>> Tom McCallum wrote:
>>> Hi Everyone,
>>>
>>> I have a question about for loops.  If you have something like:
>>>
>>> f <- function(x) {
>>>       y <- rep(NA,10);
>>>       for( i in 1:10 ) {
>>>               if ( i > 3 ) {
>>>                       if ( is.na(y[i-3]) == FALSE ) {
>>>                               # some calculation F which depends on one or more of the previously
>>> generated values in the series
>>>                               y[i] = y[i-1]+x[i];
>>>                       } else {
>>>                               y[i] <- x[i];
>>>                       }
>>>               }
>>>       }
>>>       y
>>> }
>>>
>>> e.g.
>>>
>>>> f(c(1,2,3,4,5,6,7,8,9,10,11,12));
>>>   [1] NA NA NA  4  5  6 13 21 30 40
>>>
>>> is there a faster way to process this than with a 'for' loop?  I have
>>> looked at lapply as well but I have read that lapply is no faster than a
>>> for loop and for my particular application it is easier to use a for loop.
>>> Also I have seen 'rle' which I think may help me but am not sure as I have
>>> only just come across it, any ideas?
>> Hi Tom,
>>
>> In the general case, you need a loop in order to propagate calculations
>> and their results across a vector.
>>
>> In _your_ particular case however, it seems that all you are doing is a
>> cumulative sum on x (at least this is what's happening for i >= 6).
>> So you could do:
>>
>> f2 <- function(x)
>> {
>>     offset <- 3
>>     start_propagate_at <- 6
>>     y_length <- 10
>>     init_range <- (offset+1):start_propagate_at
>>     y <- rep(NA, offset)
>>     y[init_range] <- x[init_range]
>>     y[start_propagate_at:y_length] <- cumsum(x[start_propagate_at:y_length])
>>     y
>> }
>>
>> and it will return the same thing as your function 'f' (at least when 'x' doesn't
>> contain NAs) but it's not faster :-/
>>
>> IMO, using sapply for propagating calculations across a vector is not appropriate
>> because:
>>
>>   (1) It requires special care. For example, this:
>>
>>         > x <- 1:10
>>         > sapply(2:length(x), function(i) {x[i] <- x[i-1]+x[i]})
>>
>>       doesn't work because the 'x' symbol on the left side of the <- in the
>>       anonymous function doesn't refer to the 'x' symbol defined in the global
>>       environment. So you need to use tricks like this:
>>
>>         > sapply(2:length(x),
>>                  function(i) {x[i] <- x[i-1]+x[i]; assign("x", x, envir=.GlobalEnv); x[i]})
>>
>>   (2) Because of this kind of tricks, then it is _very_ slow (about 10 times
>>       slower or more than a 'for' loop).
>>
>> Cheers,
>> H.
>>
>>
>>> Many thanks
>>>
>>> Tom
>>>
>>>
>>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> 
>



More information about the R-devel mailing list