[R] AIC models are not all fitted to the same number of observation

Wed Mar 21 15:55:19 CET 2012

Le 21/03/2012 10:56, Patrick Giraudoux a écrit :
> Hi,
>
> Using lme from the package nlme  3.1-103, I meet a strange warning. I 
> am trying to compare to models with:
>
> library(nlme)
> lmez6=lme(lepus~vulpes,random=~1|troncon/an,data=ika_z6_test)
> lmez60=lme(lepus~1,random=~1|troncon/an,data=ika_z6_test)
>
> Both have the same number of observations and groups:
>
> lmez6
> Linear mixed-effects model fit by REML
>   Data: ika_z6_test
>   Log-restricted-likelihood: -2267.756
>   Fixed: lepus ~ vulpes
> (Intercept)      vulpes
>  1.35017117  0.04722338
>
> Random effects:
>  Formula: ~1 | troncon
>         (Intercept)
> StdDev:   0.8080261
>
>  Formula: ~1 | an %in% troncon
>         (Intercept)  Residual
> StdDev:    1.086611 0.4440076
>
> Number of Observations: 1350
> Number of Groups:
>         troncon an %in% troncon
>             169            1350
>
>
> > lmez60
> Linear mixed-effects model fit by REML
>   Data: ika_z6_test
>   Log-restricted-likelihood: -2266.869
>   Fixed: lepus ~ 1
> (Intercept)
>    1.435569
>
> Random effects:
>  Formula: ~1 | troncon
>         (Intercept)
> StdDev:   0.8139646
>
>  Formula: ~1 | an %in% troncon
>         (Intercept)  Residual
> StdDev:    1.086843 0.4445815
>
> Number of Observations: 1350
> Number of Groups:
>         troncon an %in% troncon
>             169            1350
>
> ...but when I want to compare their AIC, I get:
>
> AIC(lmez6,lmez60)
>        df      AIC
> lmez6   5 4545.511
> lmez60  4 4541.737
> Warning message:
> In AIC.default(lmez6, lmez60) :
>   models are not all fitted to the same number of observations
>
>
> Has anybody an explanation about this strange warning ? To what extent 
> this warning may limit the conclusions that could be drawn from AIC 
> comparison ?
>
> Thanks in advance,
>
> Patrick
>
>
>

Sorry to go on on the thread, I have created, but the trouble I meet is 
above my level in stats... Actually, not using AIC but an anova 
approach, I get a more informative message:

anova(lmez6, lmez60)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
lmez6      1  5 4545.511 4571.543 -2267.756
lmez60     2  4 4541.737 4562.566 -2266.869 1 vs 2 1.774036  0.1829
Warning message:
In anova.lme(lmez6, lmez60) :
   Fitted objects with different fixed effects. REML comparisons are not 
meaningful.

And fubbling a bit more, I disclosed that this was an effect of fitting 
the model using REML. If fitted using ML, things are going (apparently) 
smoothly:

lmez6=lme(lepus~vulpes,random=~1|troncon/an,data=ika_z6_test,method="ML")
 > lmez60=lme(lepus~1,random=~1|troncon/an,data=ika_z6_test,method="ML")
 > anova(lmez6, lmez60)
        Model df      AIC      BIC    logLik   Test  L.Ratio p-value
lmez6      1  5 4536.406 4562.445 -2263.203
lmez60     2  4 4538.262 4559.093 -2265.131 1 vs 2 3.856102  0.0496

 > AIC(lmez6,lmez60)
        df      AIC
lmez6   5 4536.406
lmez60  4 4538.262

Now I have the following problem. What I understood from Pinheiro and 
Bates's book and some forums, is that ML estimations are biased to some 
extent tending to underestimate variance parameters. So probably not to 
recommend however results looks consistent here.

Thus, I am lost. The two models looks to me clearly embedded (one is 
just a null model with the only intercept to estimate and the other with 
intercept + one independent variable (numeric), both have the same 
random effects, the same response variable and the same number of 
observations). Warnings, from this point of view sounds inconsistent. 
They are probably not, but beyond my understanding...

Any idea ?

Patrick