[R] how to convert multiple dummy variables to 1 factor variable?

Wensui Liu liuwensui at gmail.com
Sun Oct 22 21:25:02 CEST 2006


Great!

It seems everyone is having fun with R on a weekend afternoon.

Thank you so much, Marc and Peter.


On 10/22/06, Marc Schwartz <MSchwartz at mn.rr.com> wrote:
> On Sun, 2006-10-22 at 14:03 -0500, Marc Schwartz wrote:
> > On Sun, 2006-10-22 at 14:37 -0400, Wensui Liu wrote:
> > > Thank you so much, Marc and Peter,
> > >
> > > Your method works great if I want to convert N dummies into an N-level
> > > factor. But what if I want to convert N dummies into an (N+1)-level
> > > factor? I tried both ways, but neither works.
> > >
> > > Again, thank you so much!
> >
> >
> > I presume that you are referring to the situation where the base level
> > of the factor is not present as a column in the matrix, such that all of
> > the columns would be 0 in the case where the base level is present. This
> > would be the typical result of model.matrix() with default Treatment
> > contrasts.
> >
> > In that situation, we would have a matrix as follows:
> >
> > > mat
> >      Level2 Level3 Level4 Level5
> > [1,]      0      0      0      0
> > [2,]      1      0      0      0
> > [3,]      0      1      0      0
> > [4,]      0      0      1      0
> > [5,]      0      1      0      0
> > [6,]      0      0      0      0
> > [7,]      0      0      0      1
> >
> > Note that now, we do not have a 'Level1' column.
> >
> > Thus, rows 1 and 6 are all 0's, indicating that "Level1" is present.
> >
> > Taking Peter's more efficient approach of using matrix multiplication,
> > and expanding upon it:
> >
> > > factor((mat %*% (1:ncol(mat))) + 1,
> >          labels = c("Level1", colnames(mat)))
> > [1] Level1 Level2 Level3 Level4 Level3 Level1 Level5
> > Levels: Level1 Level2 Level3 Level4 Level5
>
> Actually, I was wrong in the numeric to factor conversion. The addition
> of 1 is really not needed. We just need to be sure that there are 5
> labels, one more than the number of columns:
>
> > factor(mat %*% 1:ncol(mat),
>          labels = c("Level1", colnames(mat)))
> [1] Level1 Level2 Level3 Level4 Level3 Level1 Level5
> Levels: Level1 Level2 Level3 Level4 Level5
>
> HTH,
>
> Marc
>
>
>
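
For later readers, here is a minimal, self-contained sketch of the quoted
approach. It rebuilds the example matrix and adds one assumption that is not
in the thread: passing an explicit `levels = 0:ncol(mat)` to factor(). The
names `code`, `f`, `keep` and `f_sub` are only illustrative. Without explicit
levels, factor() derives the levels from the values actually present, so if
one level never occurs in the data it finds only four distinct values and
rejects the five labels.

```r
# Rebuild the example matrix from the quoted message (no 'Level1' column).
mat <- matrix(c(0, 0, 0, 0,
                1, 0, 0, 0,
                0, 1, 0, 0,
                0, 0, 1, 0,
                0, 1, 0, 0,
                0, 0, 0, 0,
                0, 0, 0, 1),
              ncol = 4, byrow = TRUE,
              dimnames = list(NULL, paste0("Level", 2:5)))

# Row code: 0 for an all-zero row (the base level), otherwise the index of
# the column holding the 1. drop() turns the 7 x 1 result into a vector.
code <- drop(mat %*% seq_len(ncol(mat)))

# Explicit levels (an assumption added here) pin each code to its label.
f <- factor(code, levels = 0:ncol(mat),
            labels = c("Level1", colnames(mat)))
f

# Remove the rows coded as Level3: the labels still line up, and Level3
# simply remains as an empty level.
keep  <- code != 2
f_sub <- factor(code[keep], levels = 0:ncol(mat),
                labels = c("Level1", colnames(mat)))
table(f_sub)

# Round trip: default treatment contrasts reproduce the dummy columns.
all(model.matrix(~ f)[, -1] == mat)
```

This assumes each row has at most one 1; a row with several 1s would give a
sum that does not correspond to any single column.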


-- 
WenSui Liu
(http://spaces.msn.com/statcompute/blog)
Senior Decision Support Analyst
Cincinnati Children's Hospital Medical Center


