[R] removing factor level represented by less than x rows
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Fri Jul 8 17:16:23 CEST 2005
Mikkel Grum wrote:
> In a number of different situations I'm trying to
> remove factor levels that are represented by less than
> a certain number of rows, e.g. if I had the dataset aa
> below and wanted to remove the species that are
> represented in less than 2 rows:
>
> data(iris)
> aa <- iris[1:101,]
>
> In this case, since I can see that the species
> virginica only has one row, I can write:
>
> table(aa$Species)
>     setosa versicolor  virginica
>         50         50          1
>
> aa[aa$Species != "virginica", ]
>
> but:
>
> aa[aa$Species == names(table(aa$Species)> 2),]
>
> does not work.
>
> This must be a fairly common task with a
> straightforward solution that I can't see. Any ideas?
>
> Best wishes,
> Mikkel
library(Hmisc)
?combine.levels
Note, though, that this combines infrequent levels rather than removing them.
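
If the rows really should be dropped, a minimal base-R sketch along the following lines may help. It is one possible approach, not necessarily the one intended above. The names min_rows, counts, keep and bb are illustrative only. The attempt in the question fails because names() applied to the logical table returns all three level names regardless of the TRUE/FALSE values, and == then compares element by element with recycling; indexing the names by the logical test and matching with %in% avoids both problems.

```r
# Illustrative threshold: keep species represented by at least min_rows rows
min_rows <- 2

aa <- iris[1:101, ]

# Row count per level of the factor
counts <- table(aa$Species)

# Names of the levels that meet the threshold
keep <- names(counts)[counts >= min_rows]

# Keep only rows whose species is in that set
bb <- aa[aa$Species %in% keep, ]

# Subsetting leaves the empty level in place; drop it explicitly
bb$Species <- droplevels(bb$Species)

table(bb$Species)
```

The droplevels() step is optional and only matters if the unused level should also disappear from later tables and model fits.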
Frank
--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University