[R] glm model syntax

Fri May 16 19:10:57 CEST 2008

Dear Berwin:

Indeed, I was incorrect. With your data, my earlier statements hold
only when the variables are numeric, as you note. For example, if we
did

lm(y ~ as.numeric(N)+as.numeric(M), dat)
lm(y ~ as.numeric(N)*as.numeric(M), dat)
lm(y ~ as.numeric(N):as.numeric(M), dat) 

Then the latter two fit different models, but only because of the
coercion to numeric.
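
To make the numeric case concrete, here is a minimal sketch that rebuilds
Berwin's data and compares the design matrices under the coercion. The
seed and the object names X_add, X_full and X_int are my own choices for
illustration and do not come from the thread.

```r
# Rebuild the example data from the quoted message (seed chosen arbitrarily)
set.seed(1)
M <- gl(2, 1, length = 12)
N <- gl(2, 6)
dat <- data.frame(y = rnorm(12), N = N, M = M)

# Design matrices after coercing both factors to numeric
X_add  <- model.matrix(y ~ as.numeric(N) + as.numeric(M), dat)
X_full <- model.matrix(y ~ as.numeric(N) * as.numeric(M), dat)
X_int  <- model.matrix(y ~ as.numeric(N):as.numeric(M), dat)

ncol(X_add)   # 3: intercept and two slopes
ncol(X_full)  # 4: intercept, two slopes, product term
ncol(X_int)   # 2: intercept and product term only
```

With numeric predictors the ":" formula drops both slopes, so it can no
longer reproduce the "*" fit.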

> -----Original Message-----
> From: Berwin A Turlach [mailto:berwin at maths.uwa.edu.au] 
> Sent: Friday, May 16, 2008 12:27 PM
> To: Doran, Harold
> Cc: Birgit Lemcke; R Hilfe
> Subject: Re: [R] glm model syntax
> 
> G'day Harold,
> 
> On Fri, 16 May 2008 11:43:32 -0400
> "Doran, Harold" <HDoran at air.org> wrote:
> 
> > N+M gives only the main effects, N:M gives only the interaction, and
> > N*M gives the main effects and the interaction. 
> 
> I guess this raises the question of what you mean by "N:M gives 
> only the interaction" ;-)
> 
> Consider:
> 
> R> (M <- gl(2, 1, length=12))
>  [1] 1 2 1 2 1 2 1 2 1 2 1 2
> Levels: 1 2
> R> (N <- gl(2, 6))
>  [1] 1 1 1 1 1 1 2 2 2 2 2 2
> Levels: 1 2
> R> dat <- data.frame(y= rnorm(12), N=N, M=M)
> R> dim(model.matrix(y~N+M, dat))
> [1] 12  3
> R> dim(model.matrix(y~N:M, dat))
> [1] 12  5
> R> dim(model.matrix(y~N*M, dat))
> [1] 12  4
> 
> Why does the model matrix of y~N:M have more columns than the model 
> matrix of y~N*M if the former contains the interactions only 
> and the latter contains main terms and interactions?  Of 
> course, if we drop the dim() call and print the matrices, we will 
> see why.  Moreover, it seems that the model matrix constructed from 
> y~N:M has a redundant column.
> 
> Furthermore:
> 
> R> D1 <- model.matrix(y~N*M, dat)
> R> D2 <- model.matrix(y~N:M, dat)
> R> resid(lm(D1~D2-1))
> 
> Shows that the column space created by the model matrix of 
> y~N*M is completely contained within the column space created 
> by the model matrix of y~N:M, and it is easy to check that 
> the reverse is also true.  So it seems to me that y~N:M and 
> y~N*M actually fit the same models.  To see how to construct 
> one design matrix from the other, try:
> 
> R> lm(D1~D2-1)
> 
> Thus, I guess the answer is that y~N+M fits a model with main 
> terms only, while y~N:M and y~N*M fit the same model, namely a 
> model with main and interaction terms.  These two formulations 
> just create different design matrices, which has to be taken 
> into account if one tries to interpret the estimates.
> 
> Of course, all the above assumes that N and M are actually 
> factors, something that Birgit did not specify.  If N and M 
> (or only one of them) are numeric vectors, then the constructed 
> matrices might be different, but this is left as an exercise. ;-)  
> (Apparently, if N and M are both numeric, then your summary 
> is pretty much correct.)
> 
> Cheers,
> 
> 	Berwin
> 
> =========================== Full address =============================
> Berwin A Turlach                            Tel.: +65 6515 4416 (secr)
> Dept of Statistics and Applied Probability        +65 6515 6650 (self)
> Faculty of Science                          FAX : +65 6872 3919       
> National University of Singapore
> 6 Science Drive 2, Blk S16, Level 7          e-mail: statba at nus.edu.sg
> Singapore 117546                    http://www.stat.nus.edu.sg/~statba
>
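
For the factor case, the column-space argument in the quoted message can
also be checked through matrix ranks and fitted values. This is only a
sketch: the seed and the names fit1 and fit2 are assumptions of mine,
while D1 and D2 follow the quoted message.

```r
# Rebuild the example data from the quoted message (seed chosen arbitrarily)
set.seed(1)
M <- gl(2, 1, length = 12)
N <- gl(2, 6)
dat <- data.frame(y = rnorm(12), N = N, M = M)

D1 <- model.matrix(y ~ N * M, dat)   # 12 x 4
D2 <- model.matrix(y ~ N:M, dat)     # 12 x 5, one column redundant

# Equal ranks, and stacking the matrices adds nothing new,
# so both matrices span the same column space
qr(D1)$rank
qr(D2)$rank
qr(cbind(D1, D2))$rank

# Same column space means the same fitted values
fit1 <- lm(y ~ N * M, dat)
fit2 <- lm(y ~ N:M, dat)
all.equal(fitted(fit1), fitted(fit2))
```

All three ranks should come out as 4, and lm() reports an NA coefficient
for the redundant column in the y~N:M fit.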