[R] Subsetting by number of observations in a factor

Fri Aug 10 06:07:04 CEST 2007

Here is an even faster way:

> # faster way
> x.mg.size <- table(x$mg)  # count occurrences of each 'mg' level
> x.mg.5 <- names(x.mg.size)[x.mg.size > 5]  # levels with more than 5 observations
> x.new1 <- subset(x, mg %in% x.mg.5)  # keep only the rows in those levels
> x.new1
   mg data
1   A    1
4   A    4
5   D    5
6   D    6
7   A    7
8   D    8
12  A   12
13  D   13
14  A   14
16  D   16
17  D   17
18  A   18
20  A   20
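
For anyone who wants to try the idea without the original data, here is a
minimal sketch on a made-up data frame. The toy values and the threshold
of 2 are assumptions for illustration, not the data from this thread. It
also shows an ave()-based variant that should select the same rows
without going through names().

```r
# Hypothetical toy data (NOT the data frame discussed in this thread)
x <- data.frame(mg = c("A", "B", "A", "C", "A", "B"), data = 1:6)

# Count the rows in each level of 'mg'
x.mg.size <- table(x$mg)

# Levels that occur more than twice (threshold picked for the toy data)
keep <- names(x.mg.size)[x.mg.size > 2]

# Keep only the rows whose 'mg' is one of those levels
x.new <- x[x$mg %in% keep, ]

# Variant: ave() returns, for every row, the size of that row's group
n.per.row <- ave(seq_along(x$mg), x$mg, FUN = length)
x.new2 <- x[n.per.row > 2, ]
```

Both x.new and x.new2 keep only the rows for level "A" here. Which form
is quicker on a large data set would need to be timed, e.g. with
system.time().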

On 8/9/07, Ron Crump <ron.crump at une.edu.au> wrote:
> Jim,
>
> > Does this do what you want?  It creates a new dataframe with those
> > 'mg' that have at least a certain number of observations.
>
> Looks good. I also have an alternative solution which appears to work,
> so I'll see which is quicker on the big data set in question.
>
> My solution:
>
> # 'in' is a reserved word in R, so the data frame is called 'dat' here
> mgsize <- as.data.frame(table(dat$mg))
> dat2 <- merge(dat, mgsize, by.x = "mg", by.y = "Var1")
> out <- subset(dat2, Freq > 1, select = -Freq)
>
> Thanks for your help.
>
> Ron.
>
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?