[R-sig-ME] lme() vs aov()

Peter Dalgaard p.dalgaard at biostat.ku.dk
Thu May 22 22:19:54 CEST 2008


Federico Calboli wrote:
> Hi All,
>
> I was playing with a small dataset of 12 observations, a very basic 
> nested model, with 3 drugs, 2 sources for each drug and two response 
> counts for each source (the response is some medical parameter, of no 
> real interest here). The data is:
>
> drug source response
> d1 a 102
> d1 a 104
> d1 q 103
> d1 q 104
> d2 d 108
> d2 d 110
> d2 b 109
> d2 b 108
> d3 l 104
> d3 l 106
> d3 s 105
> d3 s 107
>
> For kicks, and because the data is balanced I thought that I could use 
> it to compare the results of aov() with those of lme() -- I know the 
> library lme4  and lmer() should be preferred, but the stuff I am 
> ultimately testig was done with lme.
>
> In any case I fit 2 models and got 2 different answers:
>
> > mod.lme = lme(response ~ drug, random = ~1|source, dat)
> > mod.aov = aov(response ~ drug + Error(source), dat)
>
> > summary(mod.aov)
>
> Error: source
>           Df Sum Sq Mean Sq F value   Pr(>F)
> drug       2 61.167  30.583  61.167 0.003703 **
> Residuals  3  1.500   0.500
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Error: Within
>           Df Sum Sq Mean Sq F value Pr(>F)
> Residuals  6    9.0     1.5
>
>
> > anova(mod.lme) # I use anova here to directly compare the F-test
>             numDF denDF   F-value p-value
> (Intercept)     1     6 115207.14  <.0001
> drug            2     3     26.21  0.0126
>
> (incidentally the 3 denDF here make me think the F-test is exactly 
> what I'd expect)
>
> Because the results look different, I thought the possibilities are:
>
> 1) I fit 2 different models without realising it
> 2) one model is more conservative than the other
> 3) I'm completely missing some point (despite searching the archives 
> of R-help and R-ME)
>
> Just to be pesky, if I check the calculations against the book I got 
> the data from (Zar 4th ed, pgg 304-305) they agree with the aov() 
> results.
>
> Any illumination is gratefully asked for. I apologise in advance for 
> any annoyance past/present/future my question will cause.
>
The gut reaction is that you shouldn't trust lme() in low-df cases, but 
in this particular case the issue is different:

 > summary(mod.aov)

Error: source
          Df Sum Sq Mean Sq F value   Pr(>F)  
drug       2 61.167  30.583  61.167 0.003703 **
Residuals  3  1.500   0.500                   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: Within
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals  6    9.0     1.5              

Notice that the Residuals Mean Sq is larger in the Within stratum than 
in the source stratum. In terms of a mixed-effects model, this implies a 
negative estimate for the variance of the source effect. lme() will have 
nothing of that and sets it to zero instead. If you drop the 
Error(source) you get the same F as in lme() although the df differ.

(The  "negative variance" can be interpreted as negative within-source 
correlation, but that only works properly for balanced designs. Long 
story...)

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907




More information about the R-sig-mixed-models mailing list