[R] drop unused levels in subset.data.frame
baptiste auguie
baptiste.auguie at googlemail.com
Tue Nov 10 16:49:30 CET 2009
Dear list,
subset() has a 'drop' argument that I have often mistaken for the one in
[.factor, which removes unused levels.
Clearly it doesn't work that way, as shown below:
d <- data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]))
s <- subset(d, y=="A", drop=TRUE)
str(s)
'data.frame': 5 obs. of 2 variables:
$ x: Factor w/ 15 levels "a","b","c","d",..: 1 4 7 10 13
$ y: Factor w/ 3 levels "A","B","C": 1 1 1 1 1
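To make the difference between the two 'drop' arguments concrete, here is a minimal sketch. It assumes that 'drop' in subset() is simply handed on to the data frame indexing, so it only affects whether a single-column result collapses to a vector; the variable names v and f are illustrative only.

```r
d <- data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]))

# drop in subset(): with one selected column the result collapses
# from a data.frame to a plain factor, but its levels are untouched
v <- subset(d, y == "A", select = y, drop = TRUE)
is.data.frame(v)   # FALSE
nlevels(v)         # 3

# drop in [.factor: the unused levels themselves are discarded
f <- d$y[d$y == "A", drop = TRUE]
nlevels(f)         # 1
```

So the same argument name controls dimensions in one method and levels in the other.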
The subset still retains all the unused factor levels. How do
people usually get rid of all unused levels in a data.frame after
subsetting? I came up with this, but I may have missed a better
built-in solution:
dropit <- function (d, columns = names(d), ...)
{
    # re-index each selected column with drop=TRUE so that factors
    # lose their unused levels
    d[columns] <- lapply(d[columns], "[", drop=TRUE, ...)
    d
}
str(dropit(s))
'data.frame': 5 obs. of 2 variables:
$ x: Factor w/ 5 levels "a","d","g","j",..: 1 2 3 4 5
$ y: Factor w/ 1 level "A": 1 1 1 1 1
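A possible variant, offered only as a sketch: rebuild just the factor columns with factor(), which keeps only the levels actually present, and leave every other column alone. The helper name drop_unused and the extra numeric column z are hypothetical and not part of the example above.

```r
# Hypothetical helper (not a base R function): re-create each factor
# column so that only the levels present in the data remain.
drop_unused <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))   # which columns are factors
  df[is_fac] <- lapply(df[is_fac], factor)      # factor() drops unused levels
  df
}

d2 <- data.frame(x = factor(letters[1:15]),
                 y = factor(LETTERS[1:3]),
                 z = 1:15)                      # non-factor column, left as is
s2 <- drop_unused(subset(d2, y == "A"))
nlevels(s2$x)   # 5
nlevels(s2$y)   # 1
```

Restricting the work to factor columns avoids passing drop=TRUE to columns where it has no meaning.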
Best regards,
baptiste