[R] mean for subset

Duncan Murdoch murdoch at stats.uwo.ca
Tue Jan 5 20:11:36 CET 2010


On 05/01/2010 1:29 PM, Geoffrey Smith wrote:
> Hello, does anyone know how to take the mean for a subset of observations?
> For example, suppose my data looks like this:
>
> OBS  NAME   SCORE
> 1    Tom    92
> 2    Tom    88
> 3    Tom    56
> 4    James  85
> 5    James  75
> 6    James  32
> 7    Dawn   56
> 8    Dawn   91
> 9    Clara  95
> 10   Clara  84
>
> Is there a way to get the mean of the SCORE variable by NAME but only when
> the number of observations is equal to 3?  In other words, is there a way to
> get the mean of the SCORE variable for Tom and James, but not for Dawn and
> Clara?  Thank you.
>   

You probably want to do it in two steps:  first, find which names have
exactly 3 observations and keep only those rows of the dataset; then
compute the mean of SCORE within each remaining group.  This is one way:

 > counts <- table(dataset$NAME)
 > keep <- names(counts)[counts == 3]
 > subset <- dataset[ dataset$NAME %in% keep,]
 > tapply(subset$SCORE, subset$NAME, mean)
   Clara     Dawn    James      Tom
      NA       NA 64.00000 78.66667
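
The NA entries for Clara and Dawn appear because NAME is a factor, and
subsetting the rows keeps the now-empty levels.  A minimal self-contained
sketch of the same two steps that also drops those levels is below.  The
data frame is an assumption, reconstructed from the table in the question
(with NAME stored as a factor), and `kept` is only an illustrative name.

```r
# Assumed reconstruction of the data from the question; NAME is a factor.
dataset <- data.frame(
  OBS   = 1:10,
  NAME  = factor(c("Tom", "Tom", "Tom", "James", "James", "James",
                   "Dawn", "Dawn", "Clara", "Clara")),
  SCORE = c(92, 88, 56, 85, 75, 32, 56, 91, 95, 84)
)

# Step 1: find the names with exactly 3 observations and keep those rows.
counts <- table(dataset$NAME)
keep   <- names(counts)[counts == 3]
kept   <- dataset[dataset$NAME %in% keep, ]

# Remove the unused factor levels (Clara, Dawn) so they do not show up as NA.
kept$NAME <- droplevels(kept$NAME)

# Step 2: mean of SCORE within each remaining group.
means <- tapply(kept$SCORE, kept$NAME, mean)
print(means)
```

With the levels dropped, the result should contain entries only for James
and Tom, with the same means as in the output above.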

Duncan Murdoch


