[R] persistence of factor levels in a data frame

Peter Dalgaard p.dalgaard at biostat.ku.dk
Mon Feb 28 14:21:07 CET 2005


Lefebure Tristan <Tristan.Lefebure at univ-lyon1.fr> writes:

> Hi,
> Just something I don't understand:
> 
> data <- data.frame(V1=c(1:12),F1=c(rep("a",4),rep("b",4),rep("c",4)))
> data_ac <- data[which(data$F1 !="b"), ]  
> levels(data_ac$F1)    
> 
> Why is the level "b" still present?

Because the set of levels is a property of the factor's definition, not
of the data it currently holds. E.g., if you tabulate it, you generally
want to get a zero entry if there are no "b"s in the data. If, for some
reason, you want to reduce the factor to only those levels that are
present, factor() gets you there soon enough:

>  levels(factor(data_ac$F1))
[1] "a" "c"
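
To make the point about tabulation concrete, here is a small sketch along
the lines of the example above. Two things in it are assumptions rather
than part of the original exchange: stringsAsFactors = TRUE is written out
explicitly, because newer versions of R no longer turn strings into factors
by default, and droplevels() is only available in later R releases. The
names tab and data_ac2 are made up for the illustration.

```r
# Assumption: stringsAsFactors = TRUE is spelled out so that F1 is a factor
# regardless of the R version's default.
data <- data.frame(V1 = 1:12,
                   F1 = rep(c("a", "b", "c"), each = 4),
                   stringsAsFactors = TRUE)
data_ac <- data[data$F1 != "b", ]

# Subsetting keeps the full level set, so "b" is tabulated with a zero count.
levels(data_ac$F1)
tab <- table(data_ac$F1)
tab

# Re-creating the factor keeps only the levels that actually occur.
data_ac$F1 <- factor(data_ac$F1)
levels(data_ac$F1)

# droplevels() (later R releases) does the same for all factor columns
# of a data frame in one call.
data_ac2 <- droplevels(data[data$F1 != "b", ])
levels(data_ac2$F1)
```

Whether to drop the unused level depends on what you do next: keep it if
you want the zero cell to show up in tables, drop it if empty categories
only get in the way.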


-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907



