[R] How to average subgroups in a dataframe? (not sure how to apply aggregate(..))

Tony Breyal tony.breyal at googlemail.com
Wed Oct 21 16:23:04 CEST 2009


Thank you all for your responses. I have now achieved the desired
output for my own real data using your suggestions. I will also have
to look into this 'plyr' package, as I have noticed that it gets
mentioned a lot.


On 21 Oct, 13:33, Karl Ove Hufthammer <k... at huftis.org> wrote:
> In article <800ACFC0-2C3C-41F1-AF18-3B52F7E43... at jhsph.edu>,
> bcarv... at jhsph.edu says...
>
> > aves = aggregate(df1$score, by=list(col1=df1$col1, col2=df1$col2), mean)
> > results = merge(df1, aves)
>
> Or, with the 'plyr' package, which has a very nice syntax:
>
> library(plyr)
> ddply(df1, .(col1, col2), transform, Average=mean(score))
>
> It may be a bit slow for very large datasets, though.
>
> Here's an alternative, which will be as fast as the aggregate solution.
>
> within(df1, { Average=ave(score, col1, col2, FUN=mean) } )
>
> Which one you use is a matter of taste.
>
> And of course, the 'within' function is not the important part here;
> 'ave' is. For example, if you have attached your data frame, you just
> have to type
>
> Average=ave(score, col1, col2, FUN=mean)
>
> --
> Karl Ove Hufthammer
>
> ______________________________________________
> R-h... at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
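
For anyone finding this thread in the archive, here is a minimal,
self-contained sketch of the two base-R approaches quoted above. The
data frame below is hypothetical: the column names (col1, col2, score)
follow the thread, but the values are made up purely for illustration,
and the result column is renamed to "Average" here only so that the two
outputs are easy to compare.

```r
# Hypothetical example data; column names follow the thread,
# values are invented for illustration only.
df1 <- data.frame(
  col1  = c("A", "A", "A", "B", "B", "B"),
  col2  = c("x", "x", "y", "x", "y", "y"),
  score = c(1, 3, 5, 2, 4, 8)
)

# Approach 1: aggregate() computes one mean per (col1, col2) group,
# then merge() joins those group means back onto the original rows.
aves <- aggregate(df1$score,
                  by = list(col1 = df1$col1, col2 = df1$col2),
                  FUN = mean)
# When given a plain vector, aggregate() names the result column "x";
# rename it so it matches the column produced by approach 2.
names(aves)[names(aves) == "x"] <- "Average"
res_merge <- merge(df1, aves)

# Approach 2: ave() returns a vector of the same length as its input,
# with each element replaced by the mean of its group, so no merge
# step is needed.
res_ave <- within(df1, Average <- ave(score, col1, col2, FUN = mean))

print(res_merge)
print(res_ave)
```

Both results should carry the same group means. Note that merge() may
reorder the rows by the key columns, whereas ave() keeps the original
row order, which can matter if later code relies on row positions.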



