[R] problem with anova() and syntax in lmer

Wed Oct 17 21:57:13 CEST 2007

Dear R users,

I have two problems with lmer.
The statistical consulting service of my university recommended that I 
post them here.

Sorry for this quite long message.
Your help will be greatly appreciated...

Gilles San Martin

1) anova()

I fit a first model:
model1 <- lmer(eclw~1 + density + landsc + temp + landsc:temp + (1|region) + 
(1|region:pop) + (1|region:pop:family), data=fem1)

I then fit the same model, changing only the order of two fixed factors 
(here "temp" and "landsc"):
model2 <- lmer(eclw~1 + density + temp + landsc + landsc:temp + (1|region) + 
(1|region:pop) + (1|region:pop:family), data=fem1)

When I apply anova() to these two models, the reported sums of squares 
differ for the fixed effects whose positions were swapped:

> anova(model1)
Analysis of Variance Table
            Df  Sum Sq Mean Sq
density      1 21941.3 21941.3
landsc       1  4800.7  4800.7
temp         1 10119.9 10119.9
landsc:temp  1   292.2   292.2

> anova(model2)
Analysis of Variance Table
            Df  Sum Sq Mean Sq
density      1 21941.3 21941.3
temp         1 10441.1 10441.1
landsc       1  4479.5  4479.5
temp:landsc  1   292.2   292.2
>

How is this possible? Do the fixed effects need to be written in a 
particular order?
My dataset is unbalanced, and I have been told that this could matter for 
this problem.
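
To show what I mean outside lmer, here is a minimal sketch with made-up 
unbalanced data (the data frame "toy" and its columns a, b and y are 
hypothetical, not my real data). With lm(), anova() reports sequential sums 
of squares, so the order of the terms matters when the design is unbalanced. 
I do not know whether the same explanation applies to my lmer models.

```r
# Hypothetical toy data (NOT the fem1 dataset): two crossed factors with
# unequal cell counts, so the design is unbalanced and non-orthogonal.
set.seed(1)
toy <- data.frame(
  a = factor(rep(c("a1", "a2"), times = c(12, 8))),
  b = factor(c(rep(c("b1", "b2"), times = c(9, 3)),
               rep(c("b1", "b2"), times = c(2, 6))))
)
toy$y <- rnorm(nrow(toy)) + as.numeric(toy$a) + 0.5 * as.numeric(toy$b)

# Same model, terms entered in a different order
fit_ab <- lm(y ~ a + b, data = toy)
fit_ba <- lm(y ~ b + a, data = toy)

# Sequential sums of squares for a and b under each ordering
ss_ab <- setNames(anova(fit_ab)[c("a", "b"), "Sum Sq"], c("a", "b"))
ss_ba <- setNames(anova(fit_ba)[c("a", "b"), "Sum Sq"], c("a", "b"))
print(rbind(a_first = ss_ab, b_first = ss_ba))
```

In this toy case the individual sums of squares change with the order, while 
their total stays the same, which looks similar to what I see in my tables 
above.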

2) syntax

I have a fairly complex model, and we have not been able to find precise 
documentation on the syntax that corresponds to it.

I have:
 - 2 fixed factors: "landsc" and "temp", and their interaction "landsc:temp"
 - 1 continuous covariate, treated as fixed
 - 3 nested random factors: "region", "pop" and "family", with family nested 
in pop and pop nested in region*landsc

I'm mainly interested in the effects of "landsc" and "landsc:temp" on the 
variable I'm studying.

I had used the following syntax:
model3 <- lmer(eclw~1 + density + landsc + temp + landsc:temp + (1|region) + 
(1|region:pop) + (1|region:pop:family), data=fem1)

But somebody told me that the following one could be more correct, and now 
I'm in doubt:
model4 <- lmer(eclw~1 + density + landsc + temp + landsc:temp + (1|region) + 
(pop|region) + (family|pop), data=fem1)

The inner nested factors are coded with unique levels, as recommended here 
(Bates & Pinheiro: lme for SAS PROC MIXED users):
http://biostat.hitchcock.org/FacultyandStaff/OnlineManuals/PDF%20Files/lmesas.pdf
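
By "unique levels" I mean the following, sketched on a hypothetical toy data 
frame (not my data): each population label occurs in only one region, so the 
observed region:pop combinations are the same groups as pop alone. If I 
understand correctly, (1|region:pop) and (1|pop) should then describe the 
same grouping, but I would appreciate confirmation.

```r
# Hypothetical toy data (NOT my dataset): populations p1..p4 are labelled
# uniquely, i.e. no population label is reused in another region.
toy <- data.frame(
  region = factor(c("r1", "r1", "r1", "r2", "r2", "r2")),
  pop    = factor(c("p1", "p2", "p2", "p3", "p4", "p4"))
)

# Number of groups defined by pop alone
n_pop <- nlevels(toy$pop)

# Number of observed region-by-pop combinations (unused combinations dropped)
n_region_pop <- nlevels(interaction(toy$region, toy$pop, drop = TRUE))

print(c(pop = n_pop, region_pop = n_region_pop))
```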

Which syntax is the right one and describes the nested structure correctly? 
And what would the wrong model actually mean?
Is there general documentation on lmer syntax somewhere that we could have 
missed (not just simple examples)?
(In addition to the lme4 package help, I only have an article by D. Bates 
in R News vol. 5/1 and a book by Mr Galwey.)

I have also tried lme (without the covariate), but the denominator DF seem 
very strange to me given the containment method that is used, so I also 
wonder whether the syntax I use is correct:

> model5 <- lme(eclw ~ landsc + temp + landsc*temp,
+               random = ~ 1|region/pop/family, method = "REML", data = femr)
> anova.lme(model5)
            numDF denDF  F-value p-value
(Intercept)     1   332 546.0825  <.0001
landsc          1     9   2.8841  0.1237
temp            1   332  25.7565  <.0001
landsc:temp     1   332   0.4316  0.5117

The numbers of levels of the factors are: temp: 2; landsc: 2; region: 2; 
pop: 12; family: 34.
If I'm not mistaken, the containment method uses the same denominator DF as 
the classical ANOVA approach.
So here landsc would have to be tested against landsc*region, with 
(2-1) * (2-1) = 1 denominator DF.
And the same for temp...
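
Written out as a small calculation (this is only my own expectation under 
the classical ANOVA reasoning, not something taken from the lme 
documentation; the variable names are just for illustration):

```r
# Numbers of levels as given above
n_landsc <- 2
n_region <- 2

# Denominator DF I expected for landsc: DF of the landsc-by-region term
expected_den_df <- (n_landsc - 1) * (n_region - 1)

# Denominator DF that anova.lme() reported for landsc in the table above
reported_den_df <- 9

print(c(expected = expected_den_df, reported = reported_den_df))
```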

________________________________

Gilles San Martin y Gomez

Biodiversity Research Centre
Ecology & Biogeography Unit
University of Louvain-La-Neuve (UCL)
Croix du Sud 4/5
B-1348 Louvain-la-Neuve
Belgium

Tel. +32 (0)10 47 21 73
E-mail: gilles.sanmartin at gmail.com