[R] glm model syntax

Berwin A Turlach berwin at maths.uwa.edu.au
Fri May 16 18:27:15 CEST 2008


G'day Harold,

On Fri, 16 May 2008 11:43:32 -0400
"Doran, Harold" <HDoran at air.org> wrote:

> N+M gives only the main effects, N:M gives only the interaction, and
> N*M gives the main effects and the interaction.

I guess this raises the question of what you mean by "N:M gives only
the interaction" ;-)

Consider:

R> (M <- gl(2, 1, length=12))
 [1] 1 2 1 2 1 2 1 2 1 2 1 2
Levels: 1 2
R> (N <- gl(2, 6))
 [1] 1 1 1 1 1 1 2 2 2 2 2 2
Levels: 1 2
R> dat <- data.frame(y= rnorm(12), N=N, M=M)
R> dim(model.matrix(y~N+M, dat))
[1] 12  3
R> dim(model.matrix(y~N:M, dat))
[1] 12  5
R> dim(model.matrix(y~N*M, dat))
[1] 12  4

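Why does the model matrix of y~N:M have more columns than the model
matrix of y~N*M if the former contains the interactions only and the
latter contains main terms and interactions?  Of course, if we drop the
dim() call and print the matrices, we will see why: y~N:M gets an
intercept plus one indicator column for each of the four combinations
of N and M.  Moreover, the model matrix constructed from y~N:M has a
redundant column, since these four indicator columns sum to the
intercept column.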
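
To make the redundancy concrete, here is a minimal sketch that rebuilds
the example data and compares the number of columns of each matrix with
its numerical rank.  The seed and the names X_int and X_full are
arbitrary choices for illustration, not part of the original example.

```r
# Rebuild the example data; the seed is an arbitrary choice for reproducibility
set.seed(1)
M <- gl(2, 1, length = 12)
N <- gl(2, 6)
dat <- data.frame(y = rnorm(12), N = N, M = M)

X_int  <- model.matrix(y ~ N:M, dat)    # "interaction only" formula
X_full <- model.matrix(y ~ N * M, dat)  # main effects plus interaction

# Inspect which columns each formula generates
colnames(X_int)
colnames(X_full)

# Compare the column count with the numerical rank from a QR decomposition
c(columns = ncol(X_int),  rank = qr(X_int)$rank)
c(columns = ncol(X_full), rank = qr(X_full)$rank)
```

Both matrices should have rank 4, so the fifth column of the y~N:M
matrix adds nothing to its column space.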

Furthermore:

R> D1 <- model.matrix(y~N*M, dat)
R> D2 <- model.matrix(y~N:M, dat)
R> resid(lm(D1~D2-1))

The residuals are all zero (up to rounding error), which shows that the
column space of the model matrix of y~N*M is completely contained
within the column space of the model matrix of y~N:M, and it is easy to
check that the reverse is also true.  So it seems to me that y~N:M and
y~N*M actually fit the same model.  To see how to construct one design
matrix from the other, try:

R> lm(D1~D2-1)
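
The check in both directions can be sketched as follows.  This is only
an illustration: the seed and the names res_fwd and res_rev are
assumptions, and "zero" is judged against an arbitrary tolerance of
1e-8.

```r
# Rebuild the example data; the seed is an arbitrary choice for reproducibility
set.seed(1)
M <- gl(2, 1, length = 12)
N <- gl(2, 6)
dat <- data.frame(y = rnorm(12), N = N, M = M)

D1 <- model.matrix(y ~ N * M, dat)
D2 <- model.matrix(y ~ N:M, dat)

# Regress the columns of one design matrix on the other, without an
# extra intercept; residuals of (numerically) zero mean the response
# columns lie in the column space of the regressor matrix
res_fwd <- resid(lm(D1 ~ D2 - 1))  # is D1 contained in the span of D2?
res_rev <- resid(lm(D2 ~ D1 - 1))  # is D2 contained in the span of D1?

max(abs(res_fwd)) < 1e-8
max(abs(res_rev)) < 1e-8
```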

Thus, I guess the answer is that y~N+M fits a model with main terms
only, while y~N:M and y~N*M fit the same model, namely a model with
main and interaction terms.  These two formulations just create
different design matrices, which has to be taken into account when
interpreting the estimates.
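
As a rough illustration of "same model, different parametrisation",
the sketch below fits both formulas with lm() and compares fitted
values and coefficients.  The seed and the names fit_int and fit_full
are assumptions made for this example.

```r
# Rebuild the example data; the seed is an arbitrary choice for reproducibility
set.seed(1)
M <- gl(2, 1, length = 12)
N <- gl(2, 6)
dat <- data.frame(y = rnorm(12), N = N, M = M)

fit_int  <- lm(y ~ N:M, dat)
fit_full <- lm(y ~ N * M, dat)

# Identical column spaces give identical fitted values
all.equal(fitted(fit_int), fitted(fit_full))

# The coefficients differ, and lm() reports NA for the redundant
# column of the y ~ N:M design matrix
coef(fit_int)
coef(fit_full)
```

The fitted values agree, but the individual estimates answer different
questions, so they cannot be read off one fit and interpreted as if
they came from the other.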

Of course, all the above assumes that N and M are actually factors,
something that Birgit did not specify.  If one or both of N and M are
numeric vectors, then the constructed matrices might be different, but
this is left as an exercise. ;-)  (Apparently, if N and M are both
numeric, then your summary is pretty much correct.)
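
For the case where both variables are numeric, a minimal sketch looks
like this.  The data frame num and its values are made up for
illustration; only the column names of the model matrices matter here.

```r
# Hypothetical data with numeric (not factor) N and M
set.seed(1)
num <- data.frame(y = rnorm(12),
                  N = rep(c(1, 2), each = 6),
                  M = rep(c(1, 2), times = 6))

cols_add  <- colnames(model.matrix(y ~ N + M, num))
cols_int  <- colnames(model.matrix(y ~ N:M, num))
cols_full <- colnames(model.matrix(y ~ N * M, num))

cols_add   # "(Intercept)" "N" "M"
cols_int   # "(Intercept)" "N:M"
cols_full  # "(Intercept)" "N" "M" "N:M"
```

With numeric variables N:M is a single product column, so y~N:M really
does omit the main-effect columns, unlike in the factor case.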

Cheers,

	Berwin

=========================== Full address =============================
Berwin A Turlach                            Tel.: +65 6515 4416 (secr)
Dept of Statistics and Applied Probability        +65 6515 6650 (self)
Faculty of Science                          FAX : +65 6872 3919       
National University of Singapore
6 Science Drive 2, Blk S16, Level 7          e-mail: statba at nus.edu.sg
Singapore 117546                    http://www.stat.nus.edu.sg/~statba


