[R] Calculate weighted mean for each group

Marc Schwartz marc_schwartz at me.com
Thu Jul 23 23:48:04 CEST 2009


On Jul 23, 2009, at 4:18 PM, Alexis Maluendas wrote:

> Hi R experts,
>
> I need to know how to calculate a weighted mean by group in a data frame.
> I have tried with the aggregate() function:
>
> data.frame(x=c(15,12,3,10,10),g=c(1,1,1,2,2,3,3),w=c(2,3,1,5,5,2,5))  
> -> d
> aggregate(d$x,by=list(d$g),weighted.mean,w=d$w)
>
> Generating the following error:
>
> Error en FUN(X[[1L]], ...) : 'x' and 'w' must have the same length
>
> Thanks in advance


Did you not notice the error message when creating the data frame:

 > d <- data.frame(x=c(15,12,3,10,10),g=c(1,1,1,2,2,3,3),w=c(2,3,1,5,5,2,5))
Error in data.frame(x = c(15, 12, 3, 10, 10), g = c(1, 1, 1, 2, 2, 3,  :
   arguments imply differing number of rows: 5, 7

You have 5 elements in 'x' and 7 in each of 'g' and 'w'...

In addition, aggregate() hands the extra argument w = d$w unchanged to
every call of weighted.mean(), so you are passing all 7 elements in d$w
to each of the subsets created by d$g, hence you are getting the
aggregate() error message.
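
To make the length mismatch concrete, here is a small sketch. It assumes a 7-row version of the data (two values appended to 'x', which are my addition and not in the original post); the object names group_lengths and res are only for illustration:

```r
# Assumed 7-row data: the last two values of 'x' are added for illustration
d <- data.frame(x = c(15, 12, 3, 10, 10, 12, 12),
                g = c(1, 1, 1, 2, 2, 3, 3),
                w = c(2, 3, 1, 5, 5, 2, 5))

# Number of 'x' values in each group, i.e. what FUN sees on each call
group_lengths <- sapply(split(d$x, d$g), length)
group_lengths

# The weights are not split: every call gets the full vector
length(d$w)

# Hence the call fails; try() captures the error instead of stopping
res <- try(aggregate(d$x, by = list(d$g), weighted.mean, w = d$w),
           silent = TRUE)
inherits(res, "try-error")
```

The groups hold 3, 2 and 2 values of 'x', while 'w' always has 7, which is exactly what the error message complains about.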

This is one of those cases where you may be better served by using
split() directly to break up the data frame into groups and then using
sapply() over the subsets:

# I am adding two values to 'x' here so that the data frame can be created
d <- data.frame(x=c(15,12,3,10,10,12,12),g=c(1,1,1,2,2,3,3),w=c(2,3,1,5,5,2,5))

 > d
   x g w
1 15 1 2
2 12 1 3
3  3 1 1
4 10 2 5
5 10 2 5
6 12 3 2
7 12 3 5

 > split(d, d$g)
$`1`
   x g w
1 15 1 2
2 12 1 3
3  3 1 1

$`2`
   x g w
4 10 2 5
5 10 2 5

$`3`
   x g w
6 12 3 2
7 12 3 5



 > sapply(split(d, d$g), function(x) weighted.mean(x$x, w = x$w))
   1    2    3
11.5 10.0 12.0
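
As a quick check of the first value, the weighted mean of a group is just sum(x * w) / sum(w). A minimal sketch for group 1, with x1, w1 and manual being illustrative names only:

```r
# Values and weights for group 1 (g == 1)
x1 <- c(15, 12, 3)
w1 <- c(2, 3, 1)

# (15*2 + 12*3 + 3*1) / (2 + 3 + 1) = 69 / 6
manual <- sum(x1 * w1) / sum(w1)
manual

# Agrees with weighted.mean() on the same inputs
all.equal(manual, weighted.mean(x1, w = w1))
```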


See ?split, which is used by tapply(), which in turn is used in  
aggregate().
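
If you would rather stay with tapply(), one possible approach (a sketch only, not the only way to do it) is to group the row indices instead of 'x', so that the function can subset both 'x' and 'w' for each group. The names idx and wm are just for illustration, and the data frame is the 7-row version with the two added 'x' values:

```r
# Same 7-row data as above (last two 'x' values added for illustration)
d <- data.frame(x = c(15, 12, 3, 10, 10, 12, 12),
                g = c(1, 1, 1, 2, 2, 3, 3),
                w = c(2, 3, 1, 5, 5, 2, 5))

# Group the row numbers, then use them to pick matching x and w values
idx <- seq_len(nrow(d))
wm <- tapply(idx, d$g, function(i) weighted.mean(d$x[i], w = d$w[i]))
wm
```

This should give the same three weighted means as the split()/sapply() version, since each call now sees 'x' and 'w' vectors of equal length.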

HTH,

Marc Schwartz



