[R-sig-ME] Refitting model with ML -- why search for optimal theta again?

Mon Mar 19 21:52:00 CET 2018

Dear list,

I apologize in advance if this question is stupid; it's just that there's something that I do not understand and I would like to learn more. I know that for linear mixed models, optimizing via ML provides slightly anticonservative parameter estimates; as I understand it, this is because ML estimates θ without taking into account uncertainty in the estimates for β (since the two are estimated simultaneously). This uncertainty can be taken into account by optimizing the REML criterion instead, which integrates out the unknown β, and hence finds estimates for θ that should be appropriate for *any* value that β could have had.

If this is correct, the REML criterion contains an added constant that varies depending on the provided fixed-effects structure, and hence REML fits with different fixed effects cannot be compared. When asking for the anova() of two REML-fitted lme4 objects, the package thus refits the models using ML:

> library(lme4)
Loading required package: Matrix
> big <- lmer(Reaction ~ Days + (1|Subject),sleepstudy)
> small <- lmer(Reaction ~ (1|Subject),sleepstudy)
> anova(small,big) # refits model
refitting model(s) with ML (instead of REML)
Data: sleepstudy
Models:
small: Reaction ~ (1 | Subject)
big: Reaction ~ Days + (1 | Subject)
      Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)    
small  3 1916.5 1926.1 -955.27   1910.5                             
big    4 1802.1 1814.8 -897.04   1794.1 116.46      1  < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

This is what I do not understand: why do we need to re-optimize the (now ML) objective function, when we already have the optimal values in big at theta and small at theta? Starting from the REML criterion, we would only need to take out the integrated fixed effects, and then we end up with an ML likelihood, correct? But then, why can't I simply do:

> big.LLML <- update(big,REML=F,devFunOnly=T)(big at theta) #compute ML deviance based on optimal theta
> small.LLML <- update(small,REML=F,devFunOnly=T)(small at theta)
> small.LLML - big.LLML # same difference, without needing to re-fit!
[1] 116.4677

i.e. take the already-found optimal θ values and plug them into the ML formula? Why do we need to search for the optimal θ values again -- even if the values found by ML differ from those found by REML, shouldn't the REML values be preferred?

Thank you very much for any insights,
Cesko