[R] Calculate weighted mean for each group
Marc Schwartz
marc_schwartz at me.com
Thu Jul 23 23:48:04 CEST 2009
On Jul 23, 2009, at 4:18 PM, Alexis Maluendas wrote:
> Hi R experts,
>
> I need to know how to calculate a weighted mean by group in a data frame.
> I have tried with the aggregate() function:
>
> data.frame(x=c(15,12,3,10,10),g=c(1,1,1,2,2,3,3),w=c(2,3,1,5,5,2,5))
> -> d
> aggregate(d$x,by=list(d$g),weighted.mean,w=d$w)
>
> Generating the following error:
>
> Error in FUN(X[[1L]], ...) : 'x' and 'w' must have the same length
>
> Thanks in advance
Did you not notice the error message when creating the data frame:
> d <- data.frame(x=c(15,12,3,10,10),g=c(1,1,1,2,2,3,3),w=c(2,3,1,5,5,2,5))
Error in data.frame(x = c(15, 12, 3, 10, 10), g = c(1, 1, 1, 2, 2, 3, :
arguments imply differing number of rows: 5, 7
You have 5 elements in 'x' and 7 in each of 'g' and 'w'...
In addition, aggregate() passes w=d$w unchanged to weighted.mean() for
every group, so all 7 weights are paired with a subset of only 2 or 3
'x' values, hence the aggregate() error message.
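To make the mismatch concrete, here is a minimal sketch that reproduces by hand what happens for group 1. It assumes the corrected 7-row data frame used in this reply; the names x1, msg and wm1 are only illustrative.

```r
# Corrected 7-row data frame (x padded with two extra values)
d <- data.frame(x = c(15, 12, 3, 10, 10, 12, 12),
                g = c(1, 1, 1, 2, 2, 3, 3),
                w = c(2, 3, 1, 5, 5, 2, 5))

# For group 1, x is subset to 3 values, but w still has all 7
x1 <- d$x[d$g == 1]
msg <- tryCatch(weighted.mean(x1, w = d$w),
                error = function(e) conditionMessage(e))
print(msg)   # the length-mismatch error, captured as a string

# Subsetting the weights with the same condition works
wm1 <- weighted.mean(x1, w = d$w[d$g == 1])
print(wm1)
```

The weights have to be subset by the same grouping as 'x', which aggregate() does not do for extra arguments passed through '...'.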
This is one of those cases where you may be better served by using
split() directly to break up the data frame into groups and then using
sapply() over the subsets:
# I am adding data here to create the data frame
d <- data.frame(x=c(15,12,3,10,10,12,12),g=c(1,1,1,2,2,3,3),w=c(2,3,1,5,5,2,5))
> d
   x g w
1 15 1 2
2 12 1 3
3  3 1 1
4 10 2 5
5 10 2 5
6 12 3 2
7 12 3 5
> split(d, d$g)
$`1`
   x g w
1 15 1 2
2 12 1 3
3  3 1 1

$`2`
   x g w
4 10 2 5
5 10 2 5

$`3`
   x g w
6 12 3 2
7 12 3 5

> sapply(split(d, d$g), function(x) weighted.mean(x$x, w = x$w))
   1    2    3
11.5 10.0 12.0
See ?split, which is used by tapply(), which in turn is used in
aggregate().
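Since tapply() builds on split(), the same result can probably be had from tapply() directly by grouping row indices instead of the values themselves. This is only a sketch of one alternative, not the only way to do it; the name res is illustrative.

```r
d <- data.frame(x = c(15, 12, 3, 10, 10, 12, 12),
                g = c(1, 1, 1, 2, 2, 3, 3),
                w = c(2, 3, 1, 5, 5, 2, 5))

# Group the row indices by g, then use each group's indices
# to pick matching elements of both x and w
res <- tapply(seq_len(nrow(d)), d$g,
              function(i) weighted.mean(d$x[i], w = d$w[i]))
print(res)
```

Because the function receives indices, 'x' and 'w' are always subset together and stay the same length.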
HTH,
Marc Schwartz