[R-sig-ME] anova function gives (irrelevant) results on two models with different numbers of observations
Douglas Bates
bates at stat.wisc.edu
Sun Apr 13 18:05:04 CEST 2008
On Sat, Apr 12, 2008 at 11:29 AM, Olivier Renaud
<Olivier.Renaud at pse.unige.ch> wrote:
> Hi,
> I have encountered what I believe is a bug or an extremely unwanted
> "feature" in lme4. In short, the anova function does not check that the
> two models are based on the same number of observations and can thus
> give irrelevant results, as exemplified below. Note that lme (at least
> on S+) correctly refuses to compare the models.
Perhaps you would like to contribute the code that performs the
appropriate checks on the models before creating the analysis of
variance table. Check out a copy of the lme4 package sources from the
R-forge repository and search for
setMethod("anova"
in the file pkg/R/lmer.R You will see that there are already checks
in that method for the models being fit to the same data argument (if
used). Patches are welcome.
> I found this problem, because I wanted to understand why the LRT gave
> extremely different value than the Wald/t-stat index/statistics/test
> (choose one ;-) ), see below. It took me a while to realize that the
> number of observations was not the same in the two models (156 vs. 160).
> It happens when an explanatory variable is present in only one of the
> two models and contains NA's, which in not an uncommon situation. The
> MLdeviances then are very different but not comparable. I'm afraid it
> might have unnoticed by other users. I have reproduced the calls in lmer
> (R) and lme (S+).
>
> Olivier
>
> R version 2.6.1 (2007-11-26)
> Package: lme4
> Version: 0.99875-9
> > (Fu.mod0.lmer <- lmer(Y ~ 1 + ( 1|subjec), data = Fu,
> na.action=na.omit, method="ML") )
> Linear mixed-effects model fit by maximum likelihood
> Formula: Y ~ 1 + (1 | subjec)
> Data: Fu
> AIC BIC logLik MLdeviance REMLdeviance
> 654.7 660.9 -325.4 650.7 651.4
> Random effects:
> Groups Name Variance Std.Dev.
> subjec (Intercept) 0.93595 0.96745
> Residual 2.96389 1.72160
> number of obs: 160, groups: subjec, 16
>
> Fixed effects:
> Estimate Std. Error t value
> (Intercept) 3.4875 0.2775 12.57
>
> > (Fu.mod1.lmer <- lmer(Y ~ expvariable + ( 1|subjec), data = Fu,
> na.action=na.omit, method="ML") )
> Linear mixed-effects model fit by maximum likelihood
> Formula: Y ~ expvariable + (1 | subjec)
> Data: Fu
> AIC BIC logLik MLdeviance REMLdeviance
> 637.8 646.9 -315.9 631.8 633.1
> Random effects:
> Groups Name Variance Std.Dev.
> subjec (Intercept) 1.0357 1.0177
> Residual 2.8796 1.6969
> number of obs: 156, groups: subjec, 16
>
> Fixed effects:
> Estimate Std. Error t value
> (Intercept) 3.2673 0.3145 10.39
> expvariable 0.4140 0.2856 1.45
>
> Correlation of Fixed Effects:
> (Intr)
> expvariable -0.398
>
> > anova(Fu.mod0.lmer, Fu.mod1.lmer)
> Data: Fu
> Models:
> Fu.mod0.lmer: Y ~ 1 + (1 | subjec)
> Fu.mod1.lmer: Y ~ expvariable + (1 | subjec)
> Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
> Fu.mod0.lmer 2 654.70 660.85 -325.35
> Fu.mod1.lmer 3 637.78 646.93 -315.89 18.917 1 1.365e-05 ***
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
>
>
> ######## In S+8.0
>
> > Fu.mod0.lme <- lme(Y ~ 1, random=~1|subjec, data = Fu,
> na.action=na.omit, method="ML")
> > summary(Fu.mod0.lme)
> Linear mixed-effects model fit by maximum likelihood
> Data: Fu
> AIC BIC logLik
> 656.7007 665.9262 -325.3503
>
> Random effects:
> Formula: ~ 1 | subjec
> (Intercept) Residual
> StdDev: 0.9674476 1.721595
>
> Fixed effects: Y ~ 1
> Value Std.Error DF t-value p-value
> (Intercept) 3.4875 0.2783988 144 12.52699 <.0001
>
> Standardized Within-Group Residuals:
> Min Q1 Med Q3 Max
> -2.244476 -0.7371921 -0.2445665 0.8289158 2.608248
>
> Number of Observations: 160
> Number of Groups: 16
> > Fu.mod1.lme <- lme(Y ~ expvariable, random=~1|subjec, data = Fu,
> na.action=na.omit, method="ML")
> > summary(Fu.mod1.lme)
> Linear mixed-effects model fit by maximum likelihood
> Data: Fu
> AIC BIC logLik
> 639.7835 651.983 -315.8918
>
> Random effects:
> Formula: ~ 1 | subjec
> (Intercept) Residual
> StdDev: 1.017715 1.696945
>
> Fixed effects: Y ~ expvariable
> Value Std.Error DF t-value p-value
> (Intercept) 3.267305 0.3164881 139 10.32363 <.0001
> expvariable 0.413960 0.2874080 139 1.44032 0.152
>
> Standardized Within-Group Residuals:
> Min Q1 Med Q3 Max
> -2.4028 -0.6934754 -0.2075537 0.7346619 2.768138
>
> Number of Observations: 156
> Number of Groups: 16
> > anova(Fu.mod0.lme, Fu.mod1.lme)
> Problem in anova.lme(Fu.mod0.lme, Fu.mod1.lme): All fitted objects must
> use the same number of observations
> Use traceback() to see the call stack
>
> --
> Olivier.Renaud at pse.unige.ch http://www.unige.ch/~renaud/
> Methodology & Data Analysis - Psychology Dept - University of Geneva
> UniMail, Office 4142 - 40, Bd du Pont d'Arve - CH-1211 Geneva 4
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
More information about the R-sig-mixed-models
mailing list