[R] mean for subset
Gabor Grothendieck
ggrothendieck at gmail.com
Tue Jan 5 20:22:04 CET 2010
Here is a solution using sqldf, which does it in a single statement:
> # read in data
> Lines <- "OBS NAME SCORE
+ 1 Tom 92
+ 2 Tom 88
+ 3 Tom 56
+ 4 James 85
+ 5 James 75
+ 6 James 32
+ 7 Dawn 56
+ 8 Dawn 91
+ 9 Clara 95
+ 10 Clara 84"
>
> DF <- read.table(textConnection(Lines), header = TRUE)
>
> # mean SCORE per NAME, keeping only names with exactly 3 observations
> library(sqldf)
> sqldf("select NAME, avg(SCORE) from DF group by NAME having count(*) = 3")
   NAME avg(SCORE)
1 James   64.00000
2   Tom   78.66667
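For comparison, the same filter-then-average logic can be sketched in base R without sqldf. This is only one possible approach, not part of the sqldf solution above: ave() attaches each row's group size, and aggregate() then averages the rows whose group has exactly 3 observations. The names n and res are illustrative, and DF is rebuilt here only so the snippet runs on its own.

```r
# Rebuild the example data so this block is self-contained
DF <- data.frame(
  OBS   = 1:10,
  NAME  = c("Tom", "Tom", "Tom", "James", "James", "James",
            "Dawn", "Dawn", "Clara", "Clara"),
  SCORE = c(92, 88, 56, 85, 75, 32, 56, 91, 95, 84)
)

# Group size for every row, aligned with the rows of DF
n <- ave(DF$SCORE, DF$NAME, FUN = length)

# Keep rows whose NAME occurs exactly 3 times, then take the mean by NAME
res <- aggregate(SCORE ~ NAME, data = DF[n == 3, ], FUN = mean)
res
```

The n == 3 filter plays the role of the HAVING clause in the SQL statement; changing it to n >= 3 would keep every name with at least three observations.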
On Tue, Jan 5, 2010 at 2:03 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> Have a look at this post and the rest of that thread:
>
> https://stat.ethz.ch/pipermail/r-help/2010-January/223420.html
>
> On Tue, Jan 5, 2010 at 1:29 PM, Geoffrey Smith <gps at asu.edu> wrote:
>> Hello, does anyone know how to take the mean for a subset of observations?
>> For example, suppose my data looks like this:
>>
>> OBS NAME SCORE
>> 1 Tom 92
>> 2 Tom 88
>> 3 Tom 56
>> 4 James 85
>> 5 James 75
>> 6 James 32
>> 7 Dawn 56
>> 8 Dawn 91
>> 9 Clara 95
>> 10 Clara 84
>>
>> Is there a way to get the mean of the SCORE variable by NAME but only when
>> the number of observations is equal to 3? In other words, is there a way to
>> get the mean of the SCORE variable for Tom and James, but not for Dawn and
>> Clara? Thank you.
>>
>> --
>> Geoffrey Smith
>> Visiting Assistant Professor
>> Department of Finance
>> W. P. Carey School of Business
>> Arizona State University