[R] How to average subgroups in a dataframe? (not sure how to apply aggregate(..))
Karl Ove Hufthammer
karl at huftis.org
Wed Oct 21 14:33:46 CEST 2009
In article <800ACFC0-2C3C-41F1-AF18-3B52F7E43F07 at jhsph.edu>,
bcarvalh at jhsph.edu says...
> aves = aggregate(df1$score, by=list(col1=df1$col1, col2=df1$col2), mean)
> results = merge(df1, aves)
Or, with the 'plyr' package, which has a very nice syntax:
library(plyr)
ddply(df1, .(col1, col2), transform, Average=mean(score))
It may be a bit slow for very large datasets, though.
Here's an alternative, which will be as fast as the aggregate solution.
within(df1, { Average=ave(score, col1, col2, FUN=mean) } )
Which one you use is a matter of taste.
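To make the 'ave' approach concrete, here is a minimal sketch on a toy data frame. The column names follow the thread, but the values in df1 are invented purely for illustration.

```r
# Hypothetical data: column names as in the thread, values made up.
df1 <- data.frame(
  col1  = c("a", "a", "a", "b", "b"),
  col2  = c("x", "x", "y", "x", "x"),
  score = c(1, 3, 5, 2, 4)
)

# ave() returns a vector as long as 'score' in which every row holds the
# mean of its own (col1, col2) group; within() stores it as a new column.
res <- within(df1, { Average = ave(score, col1, col2, FUN = mean) })
print(res)
# The Average column is 2, 2, 5, 3, 3:
#   (a, x) -> mean(1, 3) = 2;  (a, y) -> 5;  (b, x) -> mean(2, 4) = 3
```

Unlike the aggregate/merge route, this keeps the rows in their original order and needs no separate merge step.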
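And of course, the 'within' function is not the important part here;
'ave' is. For example, if you have attached your data frame with
attach(), you just have to type the following, which creates a
stand-alone vector named Average rather than a new column: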
Average=ave(score, col1, col2, FUN=mean)
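If you would rather not attach the data frame, the same call can be wrapped in with() and assigned straight to a column. This is only a sketch, again on invented data. Note that FUN has to be passed by name, because every unnamed argument after the first is treated as a grouping variable.

```r
# Hypothetical data: column names as in the thread, values made up.
df1 <- data.frame(
  col1  = c("a", "a", "a", "b", "b"),
  col2  = c("x", "x", "y", "x", "x"),
  score = c(1, 3, 5, 2, 4)
)

# with() evaluates the expression inside df1, so attach() is not needed,
# and the result goes directly into a new column.
df1$Average <- with(df1, ave(score, col1, col2, FUN = mean))

# Any function of a numeric vector can be used, as long as FUN is named.
df1$Median <- with(df1, ave(score, col1, col2, FUN = median))
print(df1)
```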
--
Karl Ove Hufthammer