[R] Contrast matrices for nested factors
Fernando Henrique Ferraz P. da Rosa
mentus at gmx.de
Tue Sep 7 05:01:49 CEST 2004
Hi, I'd like to know if it's possible to specify different
contrast matrices in lm() for a factor that is nested within another one. This
is useful when the nested factor has a different number of levels depending on
the level of the main factor.
Let me illustrate with an example to make it clearer. Consider
the following data set:
set.seed(1)
y <- rnorm(14)                               # arbitrary response
a <- factor(c(rep(1,7),rep(2,3),rep(3,4)))   # main factor, 3 levels
b <- factor(c(1,1,1,2,2,3,3,1,1,2,1,1,2,2))  # factor nested within a
k <- factor(c(1,2,3,1,2,1,2,1,2,1,1,2,1,2))  # replicate number
internal <- data.frame(y,a,b,k)
Here y is an arbitrary response, a is the main factor, b is a
factor nested within a, and k is the replicate number. Note that the number
of levels of b depends on the level of a: when a = 1, b may take the values
1, 2 or 3, while when a = 2 or a = 3, b may take only 1 or 2.
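A quick crosstab of a against b shows the pattern (the zero cells are the
missing b = 3 combinations when a = 2 or a = 3):
> with(internal, table(a, b))
   b
a   1 2 3
  1 3 2 2
  2 2 1 0
  3 2 2 0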
I'd then like to use contrasts that sum to 0, so I issue:
z <- lm(y ~ a + a/b, data = internal,
        contrasts = list(a = contr.sum, b = contr.sum))
The problem is that the design matrix is not quite what I expected.
Instead of using a different contrast matrix for b within each level of a,
lm() uses the same contrast matrix for b everywhere, namely:
> contr.sum(3)
[,1] [,2]
1 1 0
2 0 1
3 -1 -1
So, when a = 1 the columns of the design matrix are as expected:
the coding sums to 0 over the levels of b, because b takes the values 1, 2
and 3 when a = 1. But when a = 2 or a = 3 the same contrast matrix is used,
and the factor effects no longer sum to 0. That's obviously because b never
equals 3 when a != 1, so the coding that remains is just '0' or '1'.
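What I would want for b within a = 2 and a = 3 is the two-level coding
instead, i.e.:
> contr.sum(2)
  [,1]
1    1
2   -1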
The design matrix lm() is creating is:
> model.matrix(z)
   (Intercept) a1 a2 a1:b1 a2:b1 a3:b1 a1:b2 a2:b2 a3:b2
1            1  1  0     1     0     0     0     0     0
2            1  1  0     1     0     0     0     0     0
3            1  1  0     1     0     0     0     0     0
4            1  1  0     0     0     0     1     0     0
5            1  1  0     0     0     0     1     0     0
6            1  1  0    -1     0     0    -1     0     0
7            1  1  0    -1     0     0    -1     0     0
8            1  0  1     0     1     0     0     0     0
9            1  0  1     0     1     0     0     0     0
10           1  0  1     0     0     0     0     1     0
11           1 -1 -1     0     0     1     0     0     0
12           1 -1 -1     0     0     1     0     0     0
13           1 -1 -1     0     0     0     0     0     1
14           1 -1 -1     0     0     0     0     0     1
What I would like to use is:
   (Intercept) a1 a2 a1:b1 a2:b1 a3:b1 a1:b2
1            1  1  0     1     0     0     0
2            1  1  0     1     0     0     0
3            1  1  0     1     0     0     0
4            1  1  0     0     0     0     1
5            1  1  0     0     0     0     1
6            1  1  0    -1     0     0    -1
7            1  1  0    -1     0     0    -1
8            1  0  1     0     1     0     0
9            1  0  1     0     1     0     0
10           1  0  1     0    -1     0     0
11           1 -1 -1     0     0     1     0
12           1 -1 -1     0     0     1     0
13           1 -1 -1     0     0    -1     0
14           1 -1 -1     0     0    -1     0
(notice that in the second matrix the nested-factor coding sums to 0 over
the levels of b within each level of a; in the first it doesn't for a = 2
and a = 3).
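The only workaround I can think of is building the nested sum-to-zero
columns by hand and passing them to lm() as a matrix, something along these
lines (just a rough sketch; the names nb and z2 are mine):

## one block of sum-to-zero columns per level of a
nb <- matrix(0, nrow(internal), 4,
             dimnames = list(NULL, c("a1.b1", "a1.b2", "a2.b1", "a3.b1")))
nb[a == 1, 1:2] <- contr.sum(3)[as.integer(b[a == 1]), ]  # 3 levels of b when a = 1
nb[a == 2, 3]   <- contr.sum(2)[as.integer(b[a == 2]), ]  # 2 levels of b when a = 2
nb[a == 3, 4]   <- contr.sum(2)[as.integer(b[a == 3]), ]  # 2 levels of b when a = 3
z2 <- lm(y ~ a + nb, contrasts = list(a = contr.sum))
model.matrix(z2)  # should reproduce the matrix above, up to column names and order

But I was hoping there is a cleaner way to get this directly through the
contrasts argument of lm().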
Thank you,
--
Fernando Henrique Ferraz P. da Rosa
http://www.ime.usp.br/~feferraz