[Rd] Simple performance enhancement for ave
Hadley Wickham
hadley at rice.edu
Wed May 5 18:50:40 CEST 2010
# Simulate 100,000 observations with two grouping variables (up to 750 levels each)
n <- 100000
grp1 <- sample(1:750, n, replace = TRUE)
grp2 <- sample(1:750, n, replace = TRUE)
d <- data.frame(x = rnorm(n), y = rnorm(n), grp1 = grp1, grp2 = grp2)

# Current behaviour
system.time(ave(d$x, d$grp1, d$grp2, FUN = mean))
#    user  system elapsed
#  19.840   0.125  19.967

# Same call with drop = TRUE
system.time(ave(d$x, d$grp1, d$grp2, drop = TRUE, FUN = mean))
#    user  system elapsed
#   2.898   0.058   2.956
This is a pathological example (100,000 observations falling into around
90,000 groups, out of 750 * 750 = 562,500 possible), but I don't see any
reason why drop = TRUE shouldn't be the default inside ave.
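
A minimal sketch of where the extra time likely goes, assuming ave builds its groups with interaction() followed by split(). The helper group_apply below is hypothetical (it is not part of R and is not the actual source of ave), and the sizes are shrunk so the example runs instantly. Without drop = TRUE, interaction() keeps a level for every combination of the observed values, so split() also creates a list element for each empty combination:

```r
set.seed(1)
n  <- 1000
g1 <- sample(1:50, n, replace = TRUE)
g2 <- sample(1:50, n, replace = TRUE)
x  <- rnorm(n)

# Every combination of observed levels vs. only the combinations present in the data
all_levels  <- interaction(g1, g2)
used_levels <- interaction(g1, g2, drop = TRUE)
nlevels(all_levels)
nlevels(used_levels)

# Empty groups that split() still creates and that FUN would still be called on
n_empty <- sum(lengths(split(x, all_levels)) == 0)
n_empty

# Hypothetical helper mirroring the assumed interaction()/split() approach
group_apply <- function(x, g1, g2, drop, FUN = mean) {
  g <- interaction(g1, g2, drop = drop)
  split(x, g) <- lapply(split(x, g), FUN)
  x
}

# Compare the per-observation group means with and without empty groups
same <- identical(group_apply(x, g1, g2, drop = FALSE),
                  group_apply(x, g1, g2, drop = TRUE))
same
```

Dropping the empty combinations should not change the result for FUN = mean, since empty groups contain no observations to assign back to; the final comparison checks this on the toy data.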
Hadley
--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/