[R] Interpreting model matrix columns when using contr.sum
Gang Chen
gangchen6 at gmail.com
Fri Jan 23 23:58:03 CET 2009
With the following example using contr.sum for both factors,
> dd <- data.frame(a = gl(3,4), b = gl(4,1,12)) # balanced 2-way
> model.matrix(~ a * b, dd, contrasts = list(a="contr.sum", b="contr.sum"))
   (Intercept) a1 a2 b1 b2 b3 a1:b1 a2:b1 a1:b2 a2:b2 a1:b3 a2:b3
1            1  1  0  1  0  0     1     0     0     0     0     0
2            1  1  0  0  1  0     0     0     1     0     0     0
3            1  1  0  0  0  1     0     0     0     0     1     0
4            1  1  0 -1 -1 -1    -1     0    -1     0    -1     0
5            1  0  1  1  0  0     0     1     0     0     0     0
6            1  0  1  0  1  0     0     0     0     1     0     0
7            1  0  1  0  0  1     0     0     0     0     0     1
8            1  0  1 -1 -1 -1     0    -1     0    -1     0    -1
9            1 -1 -1  1  0  0    -1    -1     0     0     0     0
10           1 -1 -1  0  1  0     0     0    -1    -1     0     0
11           1 -1 -1  0  0  1     0     0     0     0    -1    -1
12           1 -1 -1 -1 -1 -1     1     1     1     1     1     1
...
I have two questions:
(1) I assume the 1st column (under intercept) is the overall mean, the
2nd column (under a1) is the difference between the 1st level of
factor a and the overall mean, and the 4th column (under b1) is the
difference between the 1st level of factor b and the overall mean. Is
this interpretation correct?
(2) I'm not so sure about those interaction columns. For example, what
is a1:b1? Is it the 1st level of factor a at the 1st level of factor b
versus the overall mean, or something more complicated?
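One way both readings could be checked numerically is sketched below. It
assumes a simulated response y, which is not part of the example above, and
it treats the interpretation as applying to the fitted coefficients rather
than to the columns themselves. The names cell, grand, a_mean, b_mean and
checks are illustrative only. The candidate for a1:b1 tested here is the
(1,1) cell mean minus the two marginal means plus the grand mean, which is
an assumption to be verified, not something stated in the output above.

```r
# Sketch only: y is simulated, so the numbers themselves carry no meaning.
set.seed(1)
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))  # same balanced layout
dd$y <- rnorm(12)                                 # made-up response, one per cell

fit <- lm(y ~ a * b, data = dd,
          contrasts = list(a = "contr.sum", b = "contr.sum"))
cf <- coef(fit)

cell   <- tapply(dd$y, list(dd$a, dd$b), mean)  # 3 x 4 table of cell means
grand  <- mean(cell)                            # mean of the cell means
a_mean <- rowMeans(cell)                        # marginal means of a
b_mean <- colMeans(cell)                        # marginal means of b

same <- function(x, y) isTRUE(all.equal(unname(x), unname(y)))

checks <- c(
  # (1) intercept and main effects as deviations from the grand mean
  intercept = same(cf["(Intercept)"], grand),
  a1        = same(cf["a1"], a_mean[1] - grand),
  b1        = same(cf["b1"], b_mean[1] - grand),
  # (2) a1:b1 as cell mean minus both marginal means plus the grand mean
  a1b1      = same(cf["a1:b1"],
                   cell[1, 1] - a_mean[1] - b_mean[1] + grand)
)
print(checks)
```

If every entry of checks is TRUE, the coefficients match those readings for
this balanced layout; the check says nothing about unbalanced designs.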
Thanks in advance for your help,
Gang