[R-sig-ME] anova function gives (irrelevant) results on two models with different numbers of observations

Douglas Bates bates at stat.wisc.edu
Sun Apr 13 18:05:04 CEST 2008


On Sat, Apr 12, 2008 at 11:29 AM, Olivier Renaud
<Olivier.Renaud at pse.unige.ch> wrote:
> Hi,

>  I have encountered what I believe is a bug or an extremely unwanted
>  "feature" in lme4. In short, the anova function does not check that the
>  two models are based on the same number of observations and can thus
>  give irrelevant results, as exemplified below. Note that lme (at least
>  on S+) correctly refuses to compare the models.

Perhaps you would like to contribute the code that performs the
appropriate checks on the models before creating the analysis of
variance table. Check out a copy of the lme4 package sources from the
R-forge repository and search for

setMethod("anova"

in the file pkg/R/lmer.R  You will see that there are already checks
in that method for the models being fit to the same data argument (if
used).  Patches are welcome.


>  I found this problem, because I wanted to understand why the LRT gave
>  extremely different value than the Wald/t-stat index/statistics/test
>  (choose one ;-) ), see below. It took me a while to realize that the
>  number of observations was not the same in the two models (156 vs. 160).
>  It happens when an explanatory variable is present in only one of the
>  two models and contains NA's, which in not an uncommon situation. The
>  MLdeviances then are very different but not comparable. I'm afraid it
>  might have unnoticed by other users. I have reproduced the calls in lmer
>  (R) and lme (S+).
>
>  Olivier
>
>  R version 2.6.1 (2007-11-26)
>  Package:       lme4
>  Version:       0.99875-9
>   > (Fu.mod0.lmer <- lmer(Y ~ 1 + ( 1|subjec), data = Fu,
>  na.action=na.omit, method="ML") )
>  Linear mixed-effects model fit by maximum likelihood
>  Formula: Y ~ 1 + (1 | subjec)
>    Data: Fu
>    AIC   BIC logLik MLdeviance REMLdeviance
>   654.7 660.9 -325.4      650.7        651.4
>  Random effects:
>   Groups   Name        Variance Std.Dev.
>   subjec   (Intercept) 0.93595  0.96745
>   Residual             2.96389  1.72160
>  number of obs: 160, groups: subjec, 16
>
>  Fixed effects:
>             Estimate Std. Error t value
>  (Intercept)   3.4875     0.2775   12.57
>
>   > (Fu.mod1.lmer <- lmer(Y ~ expvariable + ( 1|subjec), data = Fu,
>  na.action=na.omit, method="ML") )
>  Linear mixed-effects model fit by maximum likelihood
>  Formula: Y ~ expvariable + (1 | subjec)
>    Data: Fu
>    AIC   BIC logLik MLdeviance REMLdeviance
>   637.8 646.9 -315.9      631.8        633.1
>  Random effects:
>   Groups   Name        Variance Std.Dev.
>   subjec   (Intercept) 1.0357   1.0177
>   Residual             2.8796   1.6969
>  number of obs: 156, groups: subjec, 16
>
>  Fixed effects:
>             Estimate Std. Error t value
>  (Intercept)   3.2673     0.3145   10.39
>  expvariable   0.4140     0.2856    1.45
>
>  Correlation of Fixed Effects:
>             (Intr)
>  expvariable -0.398
>
>   > anova(Fu.mod0.lmer, Fu.mod1.lmer)
>  Data: Fu
>  Models:
>  Fu.mod0.lmer: Y ~ 1 + (1 | subjec)
>  Fu.mod1.lmer: Y ~ expvariable + (1 | subjec)
>              Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
>  Fu.mod0.lmer  2  654.70  660.85 -325.35
>  Fu.mod1.lmer  3  637.78  646.93 -315.89 18.917      1  1.365e-05 ***
>  ---
>  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
>
>
>  ######## In S+8.0
>
>   > Fu.mod0.lme <- lme(Y ~ 1, random=~1|subjec, data = Fu,
>  na.action=na.omit, method="ML")
>   > summary(Fu.mod0.lme)
>  Linear mixed-effects model fit by maximum likelihood
>   Data: Fu
>        AIC      BIC    logLik
>   656.7007 665.9262 -325.3503
>
>  Random effects:
>   Formula:  ~ 1 | subjec
>         (Intercept) Residual
>  StdDev:   0.9674476 1.721595
>
>  Fixed effects: Y ~ 1
>              Value Std.Error  DF  t-value p-value
>  (Intercept) 3.4875 0.2783988 144 12.52699  <.0001
>
>  Standardized Within-Group Residuals:
>        Min         Q1        Med        Q3      Max
>   -2.244476 -0.7371921 -0.2445665 0.8289158 2.608248
>
>  Number of Observations: 160
>  Number of Groups: 16
>   > Fu.mod1.lme <- lme(Y ~ expvariable, random=~1|subjec, data = Fu,
>  na.action=na.omit, method="ML")
>   > summary(Fu.mod1.lme)
>  Linear mixed-effects model fit by maximum likelihood
>   Data: Fu
>        AIC     BIC    logLik
>   639.7835 651.983 -315.8918
>
>  Random effects:
>   Formula:  ~ 1 | subjec
>         (Intercept) Residual
>  StdDev:    1.017715 1.696945
>
>  Fixed effects: Y ~ expvariable
>                Value Std.Error  DF  t-value p-value
>  (Intercept) 3.267305 0.3164881 139 10.32363  <.0001
>  expvariable 0.413960 0.2874080 139  1.44032   0.152
>
>  Standardized Within-Group Residuals:
>      Min         Q1        Med        Q3      Max
>   -2.4028 -0.6934754 -0.2075537 0.7346619 2.768138
>
>  Number of Observations: 156
>  Number of Groups: 16
>   > anova(Fu.mod0.lme, Fu.mod1.lme)
>  Problem in anova.lme(Fu.mod0.lme, Fu.mod1.lme): All fitted objects must
>  use the same number of observations
>  Use traceback() to see the call stack
>
>  --
>  Olivier.Renaud at pse.unige.ch             http://www.unige.ch/~renaud/
>  Methodology & Data Analysis - Psychology Dept - University of Geneva
>  UniMail, Office 4142  -  40, Bd du Pont d'Arve   -  CH-1211 Geneva 4
>
>  _______________________________________________
>  R-sig-mixed-models at r-project.org mailing list
>  https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>




More information about the R-sig-mixed-models mailing list