[R] aggregating columns in a data frame in different ways
Gabor Grothendieck
ggrothendieck at gmail.com
Sat Apr 29 00:57:11 CEST 2006
Here are three possibilities:
1. aggregate the columns that you want to sum and, separately, the
columns that you want to average, and then merge the two results on type:
By <- A[, 2, drop = FALSE]
merge(aggregate(A[, 3, drop = FALSE], By, sum),
      aggregate(A[, 4, drop = FALSE], By, mean))
2. use by, which gives a matrix with one row per type (the types
become the row names rather than a column):
f <- function(x) with(x, c(count = sum(count), value = mean(value)))
do.call("rbind", by(A[, 3:4], A[, 2, drop = FALSE], f))
3. use summaryBy in the doBy package, picking off the appropriate
columns in the output (type, the sum of count and the mean of value):
library(doBy)
summaryBy(. ~ type, A[, -1], FUN = c(sum, mean))[, c(1, 2, 5)]
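
To try the first two possibilities without the original data, here is a minimal sketch that rebuilds A from the six rows shown in the quoted question below. The names res1, res2 and res2_df are only illustrative, and the last step (turning the by() matrix into a data frame with a type column) is an assumed convenience, not part of the suggestions above.

```r
# Rebuild the example data frame from the quoted question.
A <- data.frame(
  date  = factor(rep(c("2006-04-01", "2006-04-02", "2006-04-03"), each = 2)),
  type  = factor(rep(c("A", "B"), times = 3)),
  count = c(10, 4, 22, 8, 12, 14),
  value = c(99.6, 33.2, 43.2, 44.9, 12.4, 18.5)
)

# Possibility 1: aggregate each column with its own function,
# then merge the two results on the shared 'type' column.
By <- A[, 2, drop = FALSE]
res1 <- merge(aggregate(A[, 3, drop = FALSE], By, sum),
              aggregate(A[, 4, drop = FALSE], By, mean))

# Possibility 2: by() applies f to each group; rbind() stacks the
# per-group vectors into a matrix with the types as row names.
f <- function(x) with(x, c(count = sum(count), value = mean(value)))
res2 <- do.call("rbind", by(A[, 3:4], A[, 2, drop = FALSE], f))

# Assumed extra step: move the row names into an explicit 'type' column.
res2_df <- data.frame(type = rownames(res2), res2, row.names = NULL)

print(res1)
print(res2_df)
```

Both results should agree with the table requested in the question: counts of 44 and 26, and mean values of about 51.73 and 32.2 for types A and B.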
On 4/28/06, kavaumail-r at yahoo.com <kavaumail-r at yahoo.com> wrote:
> I would like to use aggregate() to combine statistics
> for several days in a data frame. My data frame looks
> similar to this:
>
>         date type count value
> 1 2006-04-01    A    10  99.6
> 2 2006-04-01    B     4  33.2
> 3 2006-04-02    A    22  43.2
> 4 2006-04-02    B     8  44.9
> 5 2006-04-03    A    12  12.4
> 6 2006-04-03    B    14  18.5
>
> ('date' is a factor, and my actual data frame has
> about 100 different 'types', not just two)
>
> I would like to sum up the 'counts' per 'type', and
> get an average of the 'values' per 'type'. In other
> words, I would like my results to look like this:
>
>   type count    value
> 1    A    44 51.73333
> 2    B    26     32.2
>
> The way I'm doing this now is to tear the table apart
> into its individual columns, then apply aggregate() to
> each column individually (using the 'type' column for
> the 'by' parameter), and finally put everything
> back together, like this:
>
> > A.count = aggregate(A$count, list(type=A$type), sum)
> > A.value = aggregate(A$value, list(type=A$type), mean)
> > B = data.frame(type=A.count$type, count=A.count$x, value=A.value$x)
>
> My actual table is a bit more involved than in this
> simple example, however, so this becomes quite
> tedious.
>
> I am hoping that there is a simpler way of doing
> this, for example by providing different FUN
> parameters for each column to the aggregate()
> function.
>
> I would appreciate any suggestions.
> Thanks
> Klaus