[R] Extracting repeated observations from a large data set

J.Brian.Adams
Sat Dec 30 21:19:57 CET 2000

I have a dataset containing over 750,000 observations, which I have read
into an n x 6 matrix.  If possible, I would like to prune it by extracting
only those observations in which a specific characteristic, contained in
column j, appears at least k times.  I have used the following, where k = 3
and the fifth column contains the test data:

			ObsMatrix[as.numeric(table(ObsMatrix[,5])) > 3,]

but it does not seem to work.  It returns certain rows from the matrix,
but not necessarily those with more than three repeats, and it only
returns one row for each match.  I need to be able to keep all of the
duplicate records in the data.  Is there a way to do this without using
several nested for loops?
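
One likely reason the expression above misbehaves: table() returns one count per distinct value in column 5, not one per row, so the logical index is shorter than the matrix and R recycles it across the rows. A minimal sketch of a loop-free alternative follows. The toy matrix and the names counts, rowCounts and Pruned are made up for illustration, and the comparison uses >= to match "at least k times".

```r
# Toy stand-in for the 750,000 x 6 matrix; all values are illustrative only.
set.seed(1)
ObsMatrix <- cbind(matrix(rnorm(40), nrow = 10),
                   c(7, 7, 7, 7, 2, 2, 2, 9, 9, 5),  # column 5: the test data
                   1:10)
k <- 3

# One count per distinct value in column 5.
counts <- table(ObsMatrix[, 5])

# Look each row's value up by name, giving one count per row of the matrix.
rowCounts <- as.vector(counts[as.character(ObsMatrix[, 5])])

# Keep every row whose column-5 value occurs at least k times.
# Use rowCounts > k instead if "more than k" is what is wanted.
Pruned <- ObsMatrix[rowCounts >= k, , drop = FALSE]
```

Because the logical index now has one entry per row, all duplicate records for a qualifying value are kept, not just one row per value.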

