[R] Subsetting a data frame by a factor, using the level that occurs the most times
Douglas Bates
bates at stat.wisc.edu
Thu Jan 20 18:16:50 CET 2005
Liaw, Andy wrote:
>>From: Douglas Bates
>>
>>michael watson (IAH-C) wrote:
>>
>>>I think that title makes sense... I hope it does...
>>>
>>>I have a data frame, one of the columns of which is a factor. I want
>>>the rows of data that correspond to the level in that factor which
>>>occurs the most times.
>>
>>So first you want to determine the mode (in the sense of the most
>>frequently occurring value) of the factor. One way to do this is
>>
>>names(which.max(table(fac)))
>>
>>Use this comparison for the subset as
>>
>>subset(data, pattern == names(which.max(table(pattern))))
>
>
> Just be careful that if there are ties (i.e., more than one level having the
> max) which.max() will randomly pick one of them. That may or may not be
> what's desired. If that is a possibility, Mick will need to think what he
> wants in such cases.
According to the documentation it picks the first one. Also, that's
what Martin Maechler told me and he wrote the code so I trust him on
that. I figure that if you have to trust someone to be meticulous and
precise then a German-speaking Swiss is a good choice.
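
For anyone who wants to see the tie behaviour for themselves, here is a small sketch. The data frame `dat` and its `value` column are made up for illustration; only the `pattern` column name comes from the thread. It assumes that "first" means first in the order of the factor levels, which is the order table() uses, and it shows one possible way to keep all tied levels if that is what is wanted.

```r
# Hypothetical toy data: levels "b" and "c" are tied with two rows each.
dat <- data.frame(
  pattern = factor(c("a", "b", "b", "c", "c")),
  value   = 1:5
)

# Counts per level, in level order: a = 1, b = 2, c = 2.
counts <- table(dat$pattern)

# which.max() returns the index of the first maximum, so the tie
# between "b" and "c" resolves to "b".
first_mode <- names(which.max(counts))

# To keep every tied level instead, compare each count with the maximum.
all_modes <- names(counts)[as.vector(counts) == max(counts)]

# Rows for the first most frequent level only.
rows_first <- subset(dat, pattern == first_mode)

# Rows for all levels that share the maximum count.
rows_all <- subset(dat, pattern %in% all_modes)
```

Which of the two subsets is appropriate depends on what the analysis needs when there are ties; the `%in%` version simply avoids dropping a tied level silently.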