[R-sig-ME] lme() vs aov()
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Thu May 22 22:19:54 CEST 2008
Federico Calboli wrote:
> Hi All,
>
> I was playing with a small dataset of 12 observations, a very basic
> nested model, with 3 drugs, 2 sources for each drug and two response
> counts for each source (the response is some medical parameter, of no
> real interest here). The data is:
>
> drug source response
> d1 a 102
> d1 a 104
> d1 q 103
> d1 q 104
> d2 d 108
> d2 d 110
> d2 b 109
> d2 b 108
> d3 l 104
> d3 l 106
> d3 s 105
> d3 s 107
>
> For kicks, and because the data is balanced I thought that I could use
> it to compare the results of aov() with those of lme() -- I know the
> library lme4 and lmer() should be preferred, but the stuff I am
> ultimately testig was done with lme.
>
> In any case I fit 2 models and got 2 different answers:
>
> > mod.lme = lme(response ~ drug, random = ~1|source, dat)
> > mod.aov = aov(response ~ drug + Error(source), dat)
>
> > summary(mod.aov)
>
> Error: source
> Df Sum Sq Mean Sq F value Pr(>F)
> drug 2 61.167 30.583 61.167 0.003703 **
> Residuals 3 1.500 0.500
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Error: Within
> Df Sum Sq Mean Sq F value Pr(>F)
> Residuals 6 9.0 1.5
>
>
> > anova(mod.lme) # I use anova here to directly compare the F-test
> numDF denDF F-value p-value
> (Intercept) 1 6 115207.14 <.0001
> drug 2 3 26.21 0.0126
>
> (incidentally the 3 denDF here make me think the F-test is exactly
> what I'd expect)
>
> Because the results look different, I thought the possibilities are:
>
> 1) I fit 2 different models without realising it
> 2) one model is more conservative than the other
> 3) I'm completely missing some point (despite searching the archives
> of R-help and R-ME)
>
> Just to be pesky, if I check the calculations against the book I got
> the data from (Zar 4th ed, pgg 304-305) they agree with the aov()
> results.
>
> Any illumination is gratefully asked for. I apologise in advance for
> any annoyance past/present/future my question will cause.
>
The gut reaction is that you shouldn't trust lme() in low-df cases, but
in this particular case the issue is different:
> summary(mod.aov)
Error: source
Df Sum Sq Mean Sq F value Pr(>F)
drug 2 61.167 30.583 61.167 0.003703 **
Residuals 3 1.500 0.500
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 6 9.0 1.5
Notice that the Residuals Mean Sq is larger in the Within stratum than
in the source stratum. In terms of a mixed-effects model, this implies a
negative estimate for the variance of the source effect. lme() will have
nothing of that and sets it to zero instead. If you drop the
Error(source) you get the same F as in lme() although the df differ.
(The "negative variance" can be interpreted as negative within-source
correlation, but that only works properly for balanced designs. Long
story...)
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
More information about the R-sig-mixed-models
mailing list