[R] adjusting "levels" after subset a table

Marc Schwartz marc_schwartz at comcast.net
Sun Dec 9 04:48:49 CET 2007


On Sat, 2007-12-08 at 19:26 -0800, Milton Cezar Ribeiro wrote:
> Dear all,
> 
> I have a data.frame with a factor column with about 10 levels.
> After extracting a subset of this data.frame by selecting 2 of my 10
> levels, the new data.frame still has the original number of levels.
> How can I adjust it so that when I try levels(my.df) I
> receive the updated number of levels?
> 
> By the way, I read my file using read.table.
> 
> I tried to solve it with:  levels(my.df$my.var)<-unique(my.df$my.var)
> but the problem remains.
> 
> Many thanks,
> 
> miltinho
> Brazil

The default when subsetting factors (which happens when you subset a
data frame) is to retain the original set of levels, even if they don't
occur in the resultant subset. This is described in ?"[.factor" where
the 'drop' argument is FALSE by default.
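
The default behaviour described above can be seen in a small sketch. The
data frame, column names and level labels here are made up for
illustration and are not from the original question:

```r
# Hypothetical data: a factor column with three levels
my.df <- data.frame(my.var = factor(c("a", "b", "c", "a", "b")),
                    value  = 1:5)

# Keep only the rows belonging to two of the three levels
sub.df <- my.df[my.df$my.var %in% c("a", "b"), ]

levels(sub.df$my.var)   # "a" "b" "c": the unused level "c" is retained
table(sub.df$my.var)    # "c" still appears, with a count of zero
```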

To subset the factor and only retain levels for those values that are
still present, you can use:

  MyFactor <- factor(MyFactor)

or

  MyFactor <- MyFactor[, drop = TRUE]

after subsetting the data frame.
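
Applied to a column of a data frame, the two forms above might look
like the following sketch. The object names (my.df, my.var, sub1.df and
so on) are illustrative assumptions. The last call, droplevels(), is
not part of the advice above; it is only available in more recent
versions of R and is shown as an additional option:

```r
# Hypothetical data: a factor column with three levels
my.df <- data.frame(my.var = factor(c("a", "b", "c", "a", "b")),
                    value  = 1:5)
keep  <- my.df$my.var %in% c("a", "b")

# Option 1: rebuild the factor from the values that remain
sub1.df <- my.df[keep, ]
sub1.df$my.var <- factor(sub1.df$my.var)
levels(sub1.df$my.var)

# Option 2: use the 'drop' argument of "[.factor"
sub2.df <- my.df[keep, ]
sub2.df$my.var <- sub2.df$my.var[, drop = TRUE]
levels(sub2.df$my.var)

# Option 3 (assumes a more recent R): drop unused levels from
# every factor column of the data frame in one call
sub3.df <- droplevels(my.df[keep, ])
levels(sub3.df$my.var)
```

Note that assigning to levels() only relabels the existing levels; it
does not remove the unused ones, which is why one of the approaches
above is needed.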

There is also a page in the R Wiki that describes some additional
approaches:

http://wiki.r-project.org/rwiki/doku.php?id=tips:data-manip:drop_unused_levels

HTH,

Marc Schwartz


