[R] How to get the most frequent value of the subgroup
David Winsemius
dwinsemius at comcast.net
Fri Mar 30 17:39:43 CEST 2012
On Mar 30, 2012, at 3:38 AM, Milan Bouchet-Valat wrote:
> On Thursday 29 March 2012 at 09:49 -0500, Yongsuhk Jung wrote:
>> Dear Members of the R-Help,
>>
>> While using the R function 'aggregate' that you developed, I came to
>> have a question.
>>
>> In that function,
>>
>>> aggregate(x, by, FUN, ..., simplify = TRUE)
>>
>> I was wondering what kind of FUN I should write if I want to get "the
>> most frequent value of the subgroup" as a summary statistic of the
>> subgroups.
>>
>> I would appreciate your ideas on this issue.
> It would have been better if you had provided sample data, as asked
> by the posting guide.
How TRUE.
>
> Anyway, here's a possibility:
>> df <- data.frame(a=rep(1:3, 2), b=c(1, 2, 2, 1, 1, 2))
>> df
>   a b
> 1 1 1
> 2 2 2
> 3 3 2
> 4 1 1
> 5 2 1
> 6 3 2
>> aggregate(df$a, list(df$b), function(x) max(table(x)))
>   Group.1 x
> 1       1 2
> 2       2 2
Prompted by the obvious error in that solution (max(table(x)) returns
the highest count, not the value that has it; the mode of "a" where
b==1 is 1 and where b==2 is 3), I thought I would take my untested
code strategy and fix it as well, now that an example was "on the
table" for discussion:
> aggregate(df[1], by=df[2], FUN=function(x){ tbl <- table(x);
      return( dimnames(tbl)[[1]][ which.max(tbl) ] )
  } )
  b a
1 1 1
2 2 3
(The modal values are in the "a" column. Because they come from
dimnames(), they are returned as character strings.)
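
For anyone who wants to reuse this, here is a minimal sketch of the same idea wrapped in a named helper. The name stat_mode is my own invention and not a base R function, and the sketch assumes every group has at least one non-NA value. It looks the winning label up in x, so the result keeps the type of the input instead of becoming a character string. Ties go to whichever value comes first in table() order, which may or may not be what you want.

```r
# Sample data from earlier in the thread.
df <- data.frame(a = rep(1:3, 2), b = c(1, 2, 2, 1, 1, 2))

# Hypothetical helper (not part of base R): most frequent value of x.
# Assumes x has at least one non-NA value.
stat_mode <- function(x) {
  tbl <- table(x)
  # which.max() picks the first maximum, so ties go to the first
  # value in table() order.
  winner <- names(tbl)[which.max(tbl)]
  # table() keeps the values as character names; find the matching
  # element of x so the result has the original type.
  x[match(winner, as.character(x))]
}

x <- df$a[df$b == 2]   # the "a" values where b == 2: 2, 3, 3
max(table(x))          # 2: the highest count, not the value
stat_mode(x)           # 3: the value that occurs most often

# The same helper used as FUN, here with the formula interface.
res <- aggregate(a ~ b, data = df, FUN = stat_mode)
res
```

With the sample data, res should hold the same modal values as above (1 for b==1 and 3 for b==2), but as integers.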
--
David Winsemius, MD
West Hartford, CT