[R] Subsetting by number of observations in a factor

Ron Crump ron.crump at une.edu.au
Fri Aug 10 02:23:05 CEST 2007


I generally do my data preparation externally to R, so
this is a bit unfamiliar to me, but a colleague has asked
me how to do certain data manipulations within R.

Anyway, basically I can get his large file into a dataframe.
One of the columns is a management group code (mg). There may be
varying numbers of observations per management group, and
he would like to subset the dataframe so that only management
groups with at least n observations are kept.

I presume I can get to this using table or tapply, then
(and I'm not sure how on this bit) creating a column nmg
containing the number of observations that corresponds to
mg for that row, then simply subsetting.
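
For concreteness, a minimal sketch of the approach described above, using only base R. The data frame dat, its column y, and the threshold n = 2 are made-up placeholders, not the colleague's actual data; only the column names mg and nmg come from the description.

```r
# Hypothetical toy data standing in for the colleague's large file
dat <- data.frame(
  mg = c("a", "a", "a", "b", "c", "c"),
  y  = c(1.2, 0.8, 1.5, 2.1, 0.3, 0.9)
)
n <- 2  # illustrative minimum number of observations per management group

# ave() applies length() within each level of mg and returns one value
# per row, so nmg holds the size of the group that row belongs to
dat$nmg <- ave(seq_along(dat$mg), dat$mg, FUN = length)

# keep only rows whose management group has at least n observations
sub <- dat[dat$nmg >= n, ]

# alternative via table(): look up each row's group count by name
counts <- table(dat$mg)
keep <- as.vector(counts[as.character(dat$mg)]) >= n
sub2 <- dat[keep, ]
```

If mg is stored as a factor, the subset may still carry the levels of the dropped groups; whether that matters depends on what is done with it next.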

So, am I on the right track? If so, how do I actually do it, and
is there an easier method than the one I am considering?

Thanks for your help,
