[R] How to average subgroups in a dataframe? (not sure how to apply aggregate(..))
Tony Breyal
tony.breyal at googlemail.com
Wed Oct 21 13:03:50 CEST 2009
Dear all,
Let's say I have the following data frame:
> set.seed(1)
> col1 <- c(rep('happy',9), rep('sad', 9))
> col2 <- rep(c(rep('alpha', 3), rep('beta', 3), rep('gamma', 3)),2)
> dates <- as.Date(rep(c('2009-10-13', '2009-10-14', '2009-10-15'),6))
> score <- rnorm(18, 10, 3)
> df1 <- data.frame(col1=col1, col2=col2, Date=dates, score=score)
> df1
    col1  col2       Date     score
1  happy alpha 2009-10-13  8.120639
2  happy alpha 2009-10-14 10.550930
3  happy alpha 2009-10-15  7.493114
4  happy  beta 2009-10-13 14.785842
5  happy  beta 2009-10-14 10.988523
6  happy  beta 2009-10-15  7.538595
7  happy gamma 2009-10-13 11.462287
8  happy gamma 2009-10-14 12.214974
9  happy gamma 2009-10-15 11.727344
10   sad alpha 2009-10-13  9.083835
11   sad alpha 2009-10-14 14.535344
12   sad alpha 2009-10-15 11.169530
13   sad  beta 2009-10-13  8.136278
14   sad  beta 2009-10-14  3.355900
15   sad  beta 2009-10-15 13.374793
16   sad gamma 2009-10-13  9.865199
17   sad gamma 2009-10-14  9.951429
18   sad gamma 2009-10-15 12.831509
Is it possible to get the following, where the new Average column holds
the mean score within each col2 group, separately for each value of col1:
    col1  col2       Date     score   Average
1  happy alpha 13/10/2009  8.120639  8.721561
2  happy alpha 14/10/2009 10.550930  8.721561
3  happy alpha 15/10/2009  7.493114  8.721561
4  happy  beta 13/10/2009 14.785842 11.104320
5  happy  beta 14/10/2009 10.988523 11.104320
6  happy  beta 15/10/2009  7.538595 11.104320
7  happy gamma 13/10/2009 11.462287 11.801535
8  happy gamma 14/10/2009 12.214974 11.801535
9  happy gamma 15/10/2009 11.727344 11.801535
10   sad alpha 13/10/2009  9.083835 11.596236
11   sad alpha 14/10/2009 14.535344 11.596236
12   sad alpha 15/10/2009 11.169530 11.596236
13   sad  beta 13/10/2009  8.136278  8.288990
14   sad  beta 14/10/2009  3.355900  8.288990
15   sad  beta 15/10/2009 13.374793  8.288990
16   sad gamma 13/10/2009  9.865199 10.882712
17   sad gamma 14/10/2009  9.951429 10.882712
18   sad gamma 15/10/2009 12.831509 10.882712
My feeling is that I should be using ?aggregate in some fashion,
but I can't see how to do it. Or possibly there's another function I
should be using?
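For reference, here is a minimal sketch of two base R approaches that might produce such a column. It is one possible way to do it, not a definitive answer. The names group_means, Average2 and df2 are made up for illustration, and the data is rebuilt exactly as above so the block runs on its own.

```r
# Rebuild the example data from the post (same seed, same columns).
set.seed(1)
col1  <- c(rep("happy", 9), rep("sad", 9))
col2  <- rep(c(rep("alpha", 3), rep("beta", 3), rep("gamma", 3)), 2)
dates <- as.Date(rep(c("2009-10-13", "2009-10-14", "2009-10-15"), 6))
score <- rnorm(18, 10, 3)
df1   <- data.frame(col1 = col1, col2 = col2, Date = dates, score = score)

# Option 1: ave() returns one value per row, namely the mean of the
# (col1, col2) group that the row belongs to. Row order is unchanged.
df1$Average <- ave(df1$score, df1$col1, df1$col2, FUN = mean)

# Option 2: aggregate() yields one row per (col1, col2) group, and
# merge() joins those group means back onto the original rows.
# Note that merge() may reorder the rows.
# group_means, Average2 and df2 are hypothetical names.
group_means <- aggregate(df1$score,
                         by = list(col1 = df1$col1, col2 = df1$col2),
                         FUN = mean)
names(group_means)[3] <- "Average2"
df2 <- merge(df1, group_means, by = c("col1", "col2"))
```

ave() is probably the shorter route here because it keeps one value per original row, whereas aggregate() collapses each group to a single row and so needs the extra merge() step.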
Thanks in advance,
Tony
O/S: Windows Vista Ultimate
> sessionInfo()
R version 2.9.2 (2009-08-24)
i386-pc-mingw32
locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base