[R] Speed up or alternative to 'For' loop

David Winsemius dwinsemius at comcast.net
Tue Jun 11 00:26:51 CEST 2013


On Jun 10, 2013, at 10:28 AM, Trevor Walker wrote:

> I have a For loop that is quite slow and am wondering if there is a faster
> option:
> 
> df <- data.frame(TreeID=rep(1:500,each=20), Age=rep(seq(1,20,1),500))
> df$Height <- exp(-0.1 + 0.2*df$Age)
> df$HeightGrowth <- NA   # initialize with NA
> for (i in 2:nrow(df))
> {if(df$TreeID[i]==df$TreeID[i-1])
>  {df$HeightGrowth[i] <- df$Height[i]-df$Height[i-1]
>  }
> }
> 
Avoid tests with if(){}else{}. Use vectorized code, possibly with 'ifelse', but in this case you need a function that does calculations within groups.
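For illustration only, here is a minimal sketch of what the 'ifelse' route could look like. It assumes the rows are already sorted by TreeID and then by Age, as they are in the example data; the helper name same_tree is made up for this sketch. It compares each row's TreeID with the previous row's and keeps the difference only where they match:

```r
# Small version of the example data (5 trees, 4 ages each)
df <- data.frame(TreeID = rep(1:5, each = 4), Age = rep(1:4, 5))
df$Height <- exp(-0.1 + 0.2 * df$Age)

n <- nrow(df)

# TRUE where a row belongs to the same tree as the row before it;
# the first row has no predecessor, so it is FALSE.
# Assumption: rows are sorted by TreeID, then Age.
same_tree <- c(FALSE, df$TreeID[-1] == df$TreeID[-n])

# Row-to-row differences for the whole column, blanked out
# wherever the previous row is a different tree
df$HeightGrowth <- ifelse(same_tree, c(NA, diff(df$Height)), NA)
```

This works only because of the sort-order assumption; a grouping function does not depend on it in the same way.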

The ave() function with diff() will do it compactly and efficiently:

> df <- data.frame(TreeID=rep(1:5,each=4), Age=rep(seq(1,4,1),5))
> df$Height <- exp(-0.1 + 0.2*df$Age)
> df$HeightGrowth <- NA   # initialize with NA

> df$HeightGrowth <- ave(df$Height, df$TreeID, FUN= function(vec) c(NA, diff(vec)))
> df
   TreeID Age   Height HeightGrowth
1       1   1 1.105171           NA
2       1   2 1.349859    0.2446879
3       1   3 1.648721    0.2988625
4       1   4 2.013753    0.3650314
5       2   1 1.105171           NA
6       2   2 1.349859    0.2446879
7       2   3 1.648721    0.2988625
8       2   4 2.013753    0.3650314
9       3   1 1.105171           NA
10      3   2 1.349859    0.2446879
11      3   3 1.648721    0.2988625
12      3   4 2.013753    0.3650314
13      4   1 1.105171           NA
14      4   2 1.349859    0.2446879
15      4   3 1.648721    0.2988625
16      4   4 2.013753    0.3650314
17      5   1 1.105171           NA
18      5   2 1.349859    0.2446879
19      5   3 1.648721    0.2988625
20      5   4 2.013753    0.3650314

(On my machine it was over six times as fast as the if-based code from Arun.)
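A rough way to check timings like this on your own machine is system.time(). Arun's code is not shown in this message, so the sketch below compares the ave() version against the loop from the original question instead; the function names loop_growth and ave_growth are invented for the sketch, and the timings will vary by machine and R version.

```r
# Full-size data from the original question (500 trees, 20 ages each)
df <- data.frame(TreeID = rep(1:500, each = 20), Age = rep(1:20, 500))
df$Height <- exp(-0.1 + 0.2 * df$Age)

# The row-by-row loop from the question, wrapped in a function
loop_growth <- function(d) {
  d$HeightGrowth <- NA_real_
  for (i in 2:nrow(d)) {
    if (d$TreeID[i] == d$TreeID[i - 1]) {
      d$HeightGrowth[i] <- d$Height[i] - d$Height[i - 1]
    }
  }
  d$HeightGrowth
}

# The grouped version using ave() and diff()
ave_growth <- function(d) {
  ave(d$Height, d$TreeID, FUN = function(vec) c(NA, diff(vec)))
}

t_loop <- system.time(res_loop <- loop_growth(df))
t_ave  <- system.time(res_ave  <- ave_growth(df))

# Both approaches should give the same values
print(all.equal(res_loop, res_ave))

# Elapsed seconds for each approach
print(c(loop = t_loop[["elapsed"]], ave = t_ave[["elapsed"]]))
```

A single system.time() call is noisy for fast code, so repeating the measurement a few times gives a more trustworthy comparison.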

-- 

David Winsemius
Alameda, CA, USA


