[R] Model matrix with redundant columns included
Marc Schwartz
marc_schwartz at comcast.net
Wed Mar 14 05:47:06 CET 2007
On Wed, 2007-03-14 at 14:57 +1100, Hong Ooi wrote:
> Hello,
>
> Normally when you call model.matrix, you get a matrix that has
> aliased/redundant columns deleted. For example:
>
> > m <- expand.grid(a=factor(1:3), b=factor(1:3))
> > model.matrix(~a + b, m)
>   (Intercept) a2 a3 b2 b3
> 1           1  0  0  0  0
> 2           1  1  0  0  0
> 3           1  0  1  0  0
> 4           1  0  0  1  0
> 5           1  1  0  1  0
> 6           1  0  1  1  0
> 7           1  0  0  0  1
> 8           1  1  0  0  1
> 9           1  0  1  0  1
> attr(,"assign")
> [1] 0 1 1 2 2
> attr(,"contrasts")
> attr(,"contrasts")$a
> [1] "contr.treatment"
>
> attr(,"contrasts")$b
> [1] "contr.treatment"
>
> The result is a matrix with 5 columns including the intercept.
>
> However, for my purposes I need a matrix that includes all columns,
> including those that would normally be redundant. Is there any way to do
> this? For the example, this would be something like
>
>   a1 a2 a3 b1 b2 b3
> 1  1  0  0  1  0  0
> 2  0  1  0  1  0  0
> 3  0  0  1  1  0  0
> 4  1  0  0  0  1  0
> 5  0  1  0  0  1  0
> 6  0  0  1  0  1  0
> 7  1  0  0  0  0  1
> 8  0  1  0  0  0  1
> 9  0  0  1  0  0  1
>
> Including -1 as part of the model formula removes the intercept and adds
> the column for the base level of the first variable, but not the rest.
>
> Thanks,
There may be a better way, but this seems to work:
> m
  a b
1 1 1
2 2 1
3 3 1
4 1 2
5 2 2
6 3 2
7 1 3
8 2 3
9 3 3
MAT <- do.call("cbind", lapply(m, function(x) model.matrix(~ x - 1)))
> MAT
  x1 x2 x3 x1 x2 x3
1  1  0  0  1  0  0
2  0  1  0  1  0  0
3  0  0  1  1  0  0
4  1  0  0  0  1  0
5  0  1  0  0  1  0
6  0  0  1  0  1  0
7  1  0  0  0  0  1
8  0  1  0  0  0  1
9  0  0  1  0  0  1
colnames(MAT) <- names(unlist(lapply(m, levels)))
> MAT
  a1 a2 a3 b1 b2 b3
1  1  0  0  1  0  0
2  0  1  0  1  0  0
3  0  0  1  1  0  0
4  1  0  0  0  1  0
5  0  1  0  0  1  0
6  0  0  1  0  1  0
7  1  0  0  0  0  1
8  0  1  0  0  0  1
9  0  0  1  0  0  1
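
One caveat on the renaming step: names(unlist(lapply(m, levels))) yields each factor name followed by a position index, which coincides with the level labels here only because the levels happen to be 1:3. A minimal sketch of building the names from the level labels themselves, assuming factors with other labels (the data frame m2 and its levels are made up for illustration and are not part of the original question):

```r
# Hypothetical data with non-numeric level labels (not from the original post)
m2 <- expand.grid(a = factor(c("lo", "mid", "hi")), b = factor(c("x", "y")))

# One full indicator block per factor, same approach as for m
MAT2 <- do.call("cbind", lapply(m2, function(x) model.matrix(~ x - 1)))

# Build names as <factor name><level label>, following the order of levels()
colnames(MAT2) <- unlist(
  lapply(names(m2), function(nm) paste0(nm, levels(m2[[nm]]))),
  use.names = FALSE
)
colnames(MAT2)
```

The columns follow the order of levels() for each factor, which for character input is the sorted order rather than the order of first appearance.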
You can cbind() the (Intercept) column back in if you require.
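
A short sketch of that last step, rebuilding MAT so the snippet runs on its own (the name MAT.int is just illustrative):

```r
m <- expand.grid(a = factor(1:3), b = factor(1:3))
MAT <- do.call("cbind", lapply(m, function(x) model.matrix(~ x - 1)))
colnames(MAT) <- names(unlist(lapply(m, levels)))

# Prepend a column of ones, named the way model.matrix() names its intercept
MAT.int <- cbind("(Intercept)" = 1, MAT)
head(MAT.int, 3)
```

Note that the result is rank-deficient by construction, since the a columns and the b columns each sum to the intercept column, so it is not suitable as-is for fitting routines that expect a full-rank design.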
HTH,
Marc Schwartz
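
As a possible alternative to the approach in this reply (not part of it), model.matrix() accepts a contrasts.arg list, and contrasts(x, contrasts = FALSE) returns a one-column-per-level coding for a factor. A minimal sketch, assuming every column of m is a factor (the names full and MAT3 are illustrative):

```r
m <- expand.grid(a = factor(1:3), b = factor(1:3))

# contrasts = FALSE requests an identity coding: one indicator column per level
full <- lapply(m, contrasts, contrasts = FALSE)

# Drop the intercept and supply the full coding for every factor
MAT3 <- model.matrix(~ a + b - 1, data = m, contrasts.arg = full)
colnames(MAT3)
```

This keeps the factor names in the column names without a separate renaming step. Leaving out the - 1 should keep the (Intercept) column alongside the full set of indicators.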