[R] somewhat ineffective suppressing intercepts
Ross Boylan
ross at biostat.ucsf.edu
Fri Sep 27 05:36:26 CEST 2013
Suppressing the intercept and contr.sum coding are not quite working as
I expect:
> mf <- data.frame(A=C(factor(c("a", "b", "c")), contr.sum))
> mm <- model.matrix(~0+A, data=mf)
> mm
  Aa Ab Ac
1  1  0  0
2  0  1  0
3  0  0  1
What I expect (and want) is
  A1 A2
1  1  0
2  0  1
3 -1 -1
When I do more complicated models every term except the first one is
coded as expected. That includes A itself if interacted with other
variables.
It seems R has decided the model really needs an intercept and is
throwing in an extra level for the first factor to ensure that I get it,
even though I said with the "0" that I didn't want it.
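A quick way to see why I say that, as a minimal sketch: the three indicator columns from ~0+A add up to a column of ones, so the constant is still in the column space and the matrix has the same rank as the one from ~A. The names mm0, mm1, row_sums, rank0 and rank1 are just mine for this check.

```r
# Rebuild the stripped-down example; A carries sum-to-zero contrasts.
mf <- data.frame(A = C(factor(c("a", "b", "c")), contr.sum))

mm0 <- model.matrix(~0 + A, data = mf)  # "no intercept" version: Aa Ab Ac
mm1 <- model.matrix(~A, data = mf)      # intercept plus A1 A2

# Each row of mm0 contains exactly one 1, so its columns sum to a constant.
row_sums <- rowSums(mm0)

# Both matrices have full column rank with three columns.
rank0 <- qr(mm0)$rank
rank1 <- qr(mm1)$rank
```

So dropping the intercept this way only changes the parameterization, not the space spanned by the columns.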
BTW, ~A produces an intercept and the two columns expected above. But I
don't want the intercept; the model matrix is going into a multinomial
model for which the intercept is not identified (since all intercepts
produce the same predicted probabilities).
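To illustrate the identification point, here is a small sketch. It assumes a multinomial logit in which the same intercept is added to every outcome's linear predictor; softmax is a hypothetical helper and the eta values are made up.

```r
# Hypothetical helper: multinomial-logit probabilities from linear predictors.
softmax <- function(eta) exp(eta) / sum(exp(eta))

eta <- c(0.3, -1.2, 2.0)       # made-up linear predictors for three outcomes

p_base  <- softmax(eta)        # no intercept
p_shift <- softmax(eta + 5)    # common intercept of 5 added to every outcome

# The common term cancels in the ratio, so the probabilities agree.
same <- isTRUE(all.equal(p_base, p_shift))
```

Any value of the common intercept gives the same probabilities, so the data cannot distinguish between them.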
What's going on here?
R 2.15.1
P.S. I think the above stripped down example illustrates the problem,
but here's a more expanded model:
> mf <- expand.grid(C(factor(c("a", "b", "c")), contr.sum),
+ C(factor(c("f", "t")), contr.sum))
> colnames(mf) <- c("A", "H")
> mf$x <- seq(6)
> mf
  A H x
1 a f 1
2 b f 2
3 c f 3
4 a t 4
5 b t 5
6 c t 6
> myformula <- ~0+A*H*x
> mm <- model.matrix(myformula, data=mf)
> mm
  Aa Ab Ac H1 x A1:H1 A2:H1 A1:x A2:x H1:x A1:H1:x A2:H1:x
1  1  0  0  1 1     1     0    1    0    1       1       0
2  0  1  0  1 2     0     1    0    2    2       0       2
3  0  0  1  1 3    -1    -1   -3   -3    3      -3      -3
4  1  0  0 -1 4    -1     0    4    0   -4      -4       0
5  0  1  0 -1 5     0    -1    0    5   -5       0      -5
6  0  0  1 -1 6     1     1   -6   -6   -6       6       6
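One possible workaround, offered only as a sketch and not necessarily the recommended approach: build the matrix with the intercept, so that A gets its contr.sum columns, and then drop the "(Intercept)" column by name. The names mm_full and mm_noint are mine.

```r
# Same expanded data as above, with contr.sum set on both factors.
mf <- expand.grid(A = C(factor(c("a", "b", "c")), contr.sum),
                  H = C(factor(c("f", "t")), contr.sum))
mf$x <- seq(6)

# Keep the intercept in the formula so A is coded as A1, A2 ...
mm_full <- model.matrix(~A * H * x, data = mf)

# ... then remove the intercept column afterwards.
mm_noint <- mm_full[, colnames(mm_full) != "(Intercept)", drop = FALSE]
```

Note that subsetting this way discards the "assign" and "contrasts" attributes that model.matrix attaches, which matters if later code relies on them.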