[R] Speed up or alternative to 'For' loop

Rui Barradas ruipbarradas at sapo.pt
Mon Jun 10 23:24:17 CEST 2013


Hello,

One way to speed it up is to use a matrix instead of a data.frame. Since 
data.frames can hold columns of different classes, access to their 
elements is slow. Your data is all numeric, so it can be held in a 
matrix. The second version below gave me a speed-up by a factor of 50.


system.time({
for (i in 2:nrow(df)) {
  if (df$TreeID[i] == df$TreeID[i - 1]) {
    df$HeightGrowth[i] <- df$Height[i] - df$Height[i - 1]
  }
}
})

system.time({
# Convert once to a numeric matrix, then run the same loop on it.
df2 <- data.matrix(df)
for (i in seq_len(nrow(df2))[-1]) {
  if (df2[i, "TreeID"] == df2[i - 1, "TreeID"])
    df2[i, "HeightGrowth"] <- df2[i, "Height"] - df2[i - 1, "Height"]
}
})

all.equal(df, as.data.frame(df2))  # TRUE
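
If the loop is not needed at all, a vectorized version is another option. The following is only a sketch and was not part of the timings above. The names same_tree and growth are just illustrative, and it assumes the rows are sorted so that each tree's records are consecutive and in age order, as in the example data.

```r
# Rebuild the example data from the original question.
df <- data.frame(TreeID = rep(1:500, each = 20), Age = rep(seq(1, 20, 1), 500))
df$Height <- exp(-0.1 + 0.2 * df$Age)

n <- nrow(df)

# TRUE where a row has the same TreeID as the row before it;
# the first row has no predecessor, so it is FALSE.
same_tree <- c(FALSE, df$TreeID[-1] == df$TreeID[-n])

# Height difference from the previous row, padded with NA for row 1.
growth <- c(NA, diff(df$Height))

# Keep the difference only within a tree; each tree's first row stays NA.
df$HeightGrowth <- ifelse(same_tree, growth, NA)
```

This replaces the row-by-row comparison with whole-column operations, so there is no repeated indexing into the data.frame.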


Hope this helps,

Rui Barradas

Em 10-06-2013 18:28, Trevor Walker escreveu:
> I have a For loop that is quite slow and am wondering if there is a faster
> option:
>
> df <- data.frame(TreeID=rep(1:500,each=20), Age=rep(seq(1,20,1),500))
> df$Height <- exp(-0.1 + 0.2*df$Age)
> df$HeightGrowth <- NA   #initialize with NA
> for (i in 2:nrow(df))
>   {if(df$TreeID[i]==df$TreeID[i-1])
>    {df$HeightGrowth[i] <- df$Height[i]-df$Height[i-1]
>    }
>   }
>
> Trevor Walker
> Email: trevordaviswalker at gmail.com


