[R] mean for subset

Achim Zeileis Achim.Zeileis at wu.ac.at
Tue Jan 5 20:09:42 CET 2010


On Tue, 5 Jan 2010, Geoffrey Smith wrote:

> Hello, does anyone know how to take the mean for a subset of observations?
> For example, suppose my data looks like this:
>
> OBS     NAME   SCORE
> 1          Tom       92
> 2          Tom       88
> 3          Tom       56
> 4          James    85
> 5          James    75
> 6          James    32
> 7          Dawn     56
> 8          Dawn     91
> 9          Clara     95
> 10        Clara     84
>
> Is there a way to get the mean of the SCORE variable by NAME but only when
> the number of observations is equal to 3?  In other words, is there a way to
> get the mean of the SCORE variable for Tom and James, but not for Dawn and
> Clara?  Thank you.

You can use tapply() together with a custom function that returns NA if 
the condition is not satisfied, e.g.

## read data
dat <- read.table(textConnection("
OBS     NAME   SCORE
1          Tom       92
2          Tom       88
3          Tom       56
4          James    85
5          James    75
6          James    32
7          Dawn     56
8          Dawn     91
9          Clara     95
10        Clara     84
"), header = TRUE)

## use tapply() with custom function
with(dat,
   tapply(SCORE, NAME, function(x) if(length(x) == 3) mean(x) else NA)
)
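
If the NA entries for Dawn and Clara are not wanted in the result, they can be dropped afterwards. The following is only a sketch: it rebuilds the example data with data.frame() rather than read.table(), uses NA_real_ in place of NA so that the result is numeric regardless of which groups qualify, and the names res and res_complete are made up for illustration.

```r
## rebuild the example data (same values as in the question)
dat <- data.frame(
  NAME  = c("Tom", "Tom", "Tom", "James", "James", "James",
            "Dawn", "Dawn", "Clara", "Clara"),
  SCORE = c(92, 88, 56, 85, 75, 32, 56, 91, 95, 84)
)

## mean per NAME, NA for groups that do not have exactly 3 observations
res <- with(dat,
  tapply(SCORE, NAME, function(x) if (length(x) == 3) mean(x) else NA_real_)
)

## keep only the groups for which a mean was computed (James and Tom)
res_complete <- res[!is.na(res)]
res_complete
```

The groups come back in alphabetical order of NAME, so James precedes Tom.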

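Alternatively, you could compute the group means and the group sizes
separately and then keep only the means of the groups with 3 observations: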

mymean <-   with(dat, tapply(SCORE, NAME, mean))
mylength <- with(dat, tapply(SCORE, NAME, length))
mymean[mylength == 3]

etc.
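
One further variant, sketched here under the same assumptions about the data, is to filter the rows first and aggregate afterwards. It uses table() to count observations per NAME and aggregate() to compute the means; the names counts, keep, dat3 and res3 are hypothetical and not part of the original reply.

```r
## rebuild the example data (same values as in the question)
dat <- data.frame(
  NAME  = c("Tom", "Tom", "Tom", "James", "James", "James",
            "Dawn", "Dawn", "Clara", "Clara"),
  SCORE = c(92, 88, 56, 85, 75, 32, 56, 91, 95, 84)
)

## number of observations per NAME
counts <- table(dat$NAME)

## names that occur exactly 3 times
keep <- names(counts)[as.vector(counts) == 3]

## restrict the data to those names, then take the mean of SCORE by NAME
dat3 <- dat[dat$NAME %in% keep, ]
res3 <- aggregate(SCORE ~ NAME, data = dat3, FUN = mean)
res3
```

This returns a data frame with one row per qualifying NAME instead of a named vector, which may be more convenient if the result is to be merged with other data.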

hth,
Z

> -- 
> Geoffrey Smith
> Visiting Assistant Professor
> Department of Finance
> W. P. Carey School of Business
> Arizona State University


