[R] Collapse factor levels

Peter Dalgaard p.dalgaard at biostat.ku.dk
Sun Nov 1 22:25:48 CET 2009


Kevin E. Thorpe wrote:
> I'm sure this is simple enough, but an R site search on my subject
> terms did not suggest a solution.  I have a numeric vector with many
> values that I wish to create a factor from having only a few levels.
> Here is a toy example.
> 
>  > x <- 1:10
>  > x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
>  > x
>  [1] A A A B B B C C C C
> Levels: A A A B B B C C C C
>  > summary(x)
> A A A B B B C C C C
> 3 0 0 3 0 0 4 0 0 0
> 
> So, there are clearly still 10 underlying levels.  The results I would
> like to see from printing the value and summary(x) are:
> 
>  > x
>  [1] A A A B B B C C C C
> Levels: A B C
>  > summary(x)
> A B C
> 3 3 4
> 
> Hopefully this makes sense.
> 
> Thanks,
> 
> Kevin
> 

It's an anomaly inherited from S-PLUS (or so I have been told). 
Actually, with the current R, you should get a warning:

 > x <- 1:10
 > x <- factor(x,levels=1:10,labels=c("A","A","A","B","B","B","C","C","C","C"))
Warning message:
In `levels<-`(`*tmp*`, value = c("A", "A", "A", "B", "B", "B", "C",  :
   duplicated levels will not be allowed in factors anymore

This works (as documented on the help page for levels!):

 > x <- 1:10
 > x <- factor(x,levels=1:10)
 > levels(x) <- c("A","A","A","B","B","B","C","C","C","C")
 > table(x)
x
A B C
3 3 4
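
The same help page also describes a named-list form of the assignment, which may read more clearly when many levels are merged. The sketch below shows that form, plus cut() as an alternative that bins the numeric values directly. The break points and the variable name y are my own choices for this toy example, not part of the original question.

```r
# Start from a factor with one level per value, as above.
x <- factor(1:10, levels = 1:10)

# Named-list form of levels<-: each name is a new level, and each
# element lists the old levels that are merged into it.
levels(x) <- list(A = c("1", "2", "3"),
                  B = c("4", "5", "6"),
                  C = c("7", "8", "9", "10"))
table(x)

# Alternative (an assumption about what suits the data): bin the numeric
# values with cut(). The intervals are (0,3], (3,6] and (6,10].
y <- cut(1:10, breaks = c(0, 3, 6, 10), labels = c("A", "B", "C"))
table(y)
```

The list form keeps the mapping explicit, so it is easier to check than a long vector of repeated labels. cut() only fits when the groups are contiguous numeric ranges.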


-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907



