[R] subset grouped data with quantile and NA's
David Carslaw
d.c.carslaw at its.leeds.ac.uk
Fri Aug 22 09:35:29 CEST 2008
I can't quite seem to solve a problem subsetting a data frame. Here's a
reproducible example.
Given a data frame:
dat <- data.frame(fac = rep(c("a", "b"), each = 100),
                  value = c(rnorm(130), rep(NA, 70)),
                  other = rnorm(200))
What I want is a new data frame (with the same columns as dat) excluding the
top 5% of "value" separately by "a" and "b". For example, this produces the
results I'm after in an array:
sub <- tapply(dat$value, dat$fac,
              function(x) x[x < quantile(x, probs = 0.95, na.rm = TRUE)])
My difficulty is putting them into a data frame along with the other columns
"fac" and "other". Note that the subsetting returns vectors of different
lengths for a and b, because the groups contain different numbers of NAs.
There's something I'm just not seeing - can you help?
Many thanks.
David Carslaw
-----
Institute for Transport Studies
University of Leeds
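
One possible way to approach the question above, offered as a sketch rather than a confirmed answer: ave() can return each group's 95th percentile as a vector aligned with the rows of dat, so the comparison is done row by row and the other columns come along for free. The names cutoff, keep and dat.sub are illustrative, set.seed() is added only to make the example repeatable, and the sketch assumes that rows with a missing "value" should be dropped.

```r
set.seed(1)  # assumption: fixed seed, only so the sketch is repeatable

dat <- data.frame(fac = rep(c("a", "b"), each = 100),
                  value = c(rnorm(130), rep(NA, 70)),
                  other = rnorm(200))

# Each group's 95th percentile, repeated so that it lines up with dat's rows
cutoff <- ave(dat$value, dat$fac,
              FUN = function(x) quantile(x, probs = 0.95, na.rm = TRUE))

# Rows strictly below their own group's cutoff; which() discards the NA
# comparisons, so rows with a missing value are dropped (an assumption)
keep <- which(dat$value < cutoff)

# Same columns as dat, top 5% of "value" removed separately for "a" and "b"
dat.sub <- dat[keep, ]
```

If the rows with a missing "value" should be kept instead, the condition could be written as is.na(dat$value) | dat$value < cutoff.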