[R-sig-ME] anova function gives (irrelevant) results on two models with different numbers of observations

Olivier Renaud Olivier.Renaud at pse.unige.ch
Sat Apr 12 18:29:37 CEST 2008


Hi,

I have encountered what I believe is a bug or an extremely unwanted 
"feature" in lme4. In short, the anova function does not check that the 
two models are based on the same number of observations and can thus 
give irrelevant results, as exemplified below. Note that lme (at least 
on S+) correctly refuses to compare the models.

I found this problem, because I wanted to understand why the LRT gave 
extremely different value than the Wald/t-stat index/statistics/test 
(choose one ;-) ), see below. It took me a while to realize that the 
number of observations was not the same in the two models (156 vs. 160). 
It happens when an explanatory variable is present in only one of the 
two models and contains NA's, which in not an uncommon situation. The 
MLdeviances then are very different but not comparable. I'm afraid it 
might have unnoticed by other users. I have reproduced the calls in lmer 
(R) and lme (S+).

Olivier

R version 2.6.1 (2007-11-26)
Package:       lme4
Version:       0.99875-9
 > (Fu.mod0.lmer <- lmer(Y ~ 1 + ( 1|subjec), data = Fu, 
na.action=na.omit, method="ML") )
Linear mixed-effects model fit by maximum likelihood
Formula: Y ~ 1 + (1 | subjec)
   Data: Fu
   AIC   BIC logLik MLdeviance REMLdeviance
 654.7 660.9 -325.4      650.7        651.4
Random effects:
 Groups   Name        Variance Std.Dev.
 subjec   (Intercept) 0.93595  0.96745
 Residual             2.96389  1.72160
number of obs: 160, groups: subjec, 16

Fixed effects:
            Estimate Std. Error t value
(Intercept)   3.4875     0.2775   12.57

 > (Fu.mod1.lmer <- lmer(Y ~ expvariable + ( 1|subjec), data = Fu, 
na.action=na.omit, method="ML") )
Linear mixed-effects model fit by maximum likelihood
Formula: Y ~ expvariable + (1 | subjec)
   Data: Fu
   AIC   BIC logLik MLdeviance REMLdeviance
 637.8 646.9 -315.9      631.8        633.1
Random effects:
 Groups   Name        Variance Std.Dev.
 subjec   (Intercept) 1.0357   1.0177 
 Residual             2.8796   1.6969 
number of obs: 156, groups: subjec, 16

Fixed effects:
            Estimate Std. Error t value
(Intercept)   3.2673     0.3145   10.39
expvariable   0.4140     0.2856    1.45

Correlation of Fixed Effects:
            (Intr)
expvariable -0.398

 > anova(Fu.mod0.lmer, Fu.mod1.lmer)
Data: Fu
Models:
Fu.mod0.lmer: Y ~ 1 + (1 | subjec)
Fu.mod1.lmer: Y ~ expvariable + (1 | subjec)
             Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)   
Fu.mod0.lmer  2  654.70  660.85 -325.35                            
Fu.mod1.lmer  3  637.78  646.93 -315.89 18.917      1  1.365e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



######## In S+8.0

 > Fu.mod0.lme <- lme(Y ~ 1, random=~1|subjec, data = Fu, 
na.action=na.omit, method="ML")
 > summary(Fu.mod0.lme)
Linear mixed-effects model fit by maximum likelihood
 Data: Fu
       AIC      BIC    logLik
  656.7007 665.9262 -325.3503

Random effects:
 Formula:  ~ 1 | subjec
        (Intercept) Residual
StdDev:   0.9674476 1.721595

Fixed effects: Y ~ 1
             Value Std.Error  DF  t-value p-value
(Intercept) 3.4875 0.2783988 144 12.52699  <.0001

Standardized Within-Group Residuals:
       Min         Q1        Med        Q3      Max
 -2.244476 -0.7371921 -0.2445665 0.8289158 2.608248

Number of Observations: 160
Number of Groups: 16
 > Fu.mod1.lme <- lme(Y ~ expvariable, random=~1|subjec, data = Fu, 
na.action=na.omit, method="ML")
 > summary(Fu.mod1.lme)
Linear mixed-effects model fit by maximum likelihood
 Data: Fu
       AIC     BIC    logLik
  639.7835 651.983 -315.8918

Random effects:
 Formula:  ~ 1 | subjec
        (Intercept) Residual
StdDev:    1.017715 1.696945

Fixed effects: Y ~ expvariable
               Value Std.Error  DF  t-value p-value
(Intercept) 3.267305 0.3164881 139 10.32363  <.0001
expvariable 0.413960 0.2874080 139  1.44032   0.152

Standardized Within-Group Residuals:
     Min         Q1        Med        Q3      Max
 -2.4028 -0.6934754 -0.2075537 0.7346619 2.768138

Number of Observations: 156
Number of Groups: 16
 > anova(Fu.mod0.lme, Fu.mod1.lme)
Problem in anova.lme(Fu.mod0.lme, Fu.mod1.lme): All fitted objects must 
use the same number of observations
Use traceback() to see the call stack

-- 
Olivier.Renaud at pse.unige.ch             http://www.unige.ch/~renaud/
Methodology & Data Analysis - Psychology Dept - University of Geneva
UniMail, Office 4142  -  40, Bd du Pont d'Arve   -  CH-1211 Geneva 4




More information about the R-sig-mixed-models mailing list