[R] drop unused levels in subset.data.frame

baptiste auguie baptiste.auguie at googlemail.com
Tue Nov 10 17:26:00 CET 2009


Neat, I reinvented the wheel! Would this make a useful example at the
end of the help page for ?subset? (It currently says very little
about drop.)
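
For instance, a minimal sketch of what such an example might look like.
It assumes the toy data from my original post; drop_unused is a
hypothetical helper name, not an existing function. It shows that 'drop'
in subset() is only passed on to the `[` indexing operator, and that
re-applying factor() to each factor column is one way to discard the
unused levels:

```r
# Toy data from the original post
d <- data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]))

# 'drop' only collapses a one-column result to a vector;
# the unused factor levels are kept.
v <- subset(d, y == "A", select = x, drop = TRUE)
is.factor(v)   # a factor rather than a data frame
nlevels(v)     # still 15

# Hypothetical helper: rebuild each factor column so that only the
# levels actually present remain; other columns are left untouched.
drop_unused <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  if (any(is_fac)) df[is_fac] <- lapply(df[is_fac], factor)
  df
}

s <- drop_unused(subset(d, y == "A"))
str(s)         # x now has 5 levels, y has 1
```

Restricting the rebuild to factor columns keeps numeric or character
columns exactly as they were.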

Thanks also to David for the alternative idea.

Best regards,

baptiste


2009/11/10 Marc Schwartz <marc_schwartz at me.com>:
> On Nov 10, 2009, at 9:49 AM, baptiste auguie wrote:
>
>> Dear list,
>>
>> subset has a 'drop' argument that I have often mistaken for the one in
>> [.factor, which removes unused levels.
>> Clearly it doesn't work that way, as shown below:
>>
>> d <- data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]))
>> s <- subset(d, y=="A", drop=TRUE)
>> str(s)
>> 'data.frame':   5 obs. of  2 variables:
>> $ x: Factor w/ 15 levels "a","b","c","d",..: 1 4 7 10 13
>> $ y: Factor w/ 3 levels "A","B","C": 1 1 1 1 1
>>
>> The subset still retains all the unused factor levels. How do
>> people usually get rid of all unused levels in a data.frame after
>> subsetting? I came up with this, but I may have missed a better
>> built-in solution:
>>
>> dropit <- function (d, columns = names(d), ...)
>> {
>>   d[columns] = lapply(d[columns], "[", drop=TRUE, ...)
>>   d
>> }
>>
>> str(dropit(s))
>> 'data.frame':   5 obs. of  2 variables:
>> $ x: Factor w/ 5 levels "a","d","g","j",..: 1 2 3 4 5
>> $ y: Factor w/ 1 level "A": 1 1 1 1 1
>
> There is a page in the R wiki here:
>
>  http://wiki.r-project.org/rwiki/doku.php?id=tips:data-manip:drop_unused_levels
>
> that has some approaches.
>
> HTH,
>
> Marc Schwartz
>
>



