[R-sig-ME] Refitting model with ML -- why search for optimal theta again?

Mon Mar 19 22:03:17 CET 2018

Hi Cesko,

why can't I simply [...] take the already-found optimal θ values and plug
> them into the ML formula? Why do we need to search for the optimal θ values
> again? even if the values found by ML differ from those found by REML,
> shouldn't the REML values be preferred?

Optimizing the likelihood vs. the REML criterion leads to different θ
estimates. The statistical theory underlying the likelihood ratio test only
holds for ML estimates of θ, not REML estimates. You can compute the
likelihood value for the REML estimates, as you do in your code example,
but this doesn't change the fact that they are REML estimates and not ML
estimates.

Jake

On Mon, Mar 19, 2018 at 3:52 PM, Voeten, C.C. <c.c.voeten at hum.leidenuniv.nl>
wrote:

> Dear list,
>
> I apologize in advance if this question is stupid; it's just that there's
> something that I do not understand and I would like to learn more. I know
> that for linear mixed models, optimizing via ML provides slightly
> anticonservative parameter estimates; as I understand it, this is because
> ML estimates θ without taking into account uncertainty in the estimates for
> β (since the two are estimated simultaneously). This uncertainty can be
> taken into account by optimizing the REML criterion instead, which
> integrates out the unknown β, and hence finds estimates for θ that should
> be appropriate for *any* value that β could have had.
>
> If this is correct, the REML criterion contains an added constant that
> varies depending on the provided fixed-effects structure, and hence REML
> fits with different fixed effects cannot be compared. When asking for the
> anova() of two REML-fitted lme4 objects, the package thus refits the models
> using ML:
>
>
> > library(lme4)
> Loading required package: Matrix
> > big <- lmer(Reaction ~ Days + (1|Subject),sleepstudy)
> > small <- lmer(Reaction ~ (1|Subject),sleepstudy)
> > anova(small,big) # refits model
> refitting model(s) with ML (instead of REML)
> Data: sleepstudy
> Models:
> small: Reaction ~ (1 | Subject)
> big: Reaction ~ Days + (1 | Subject)
>       Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
> small  3 1916.5 1926.1 -955.27   1910.5
> big    4 1802.1 1814.8 -897.04   1794.1 116.46      1  < 2.2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>
> This is what I do not understand: why do we need to re-optimize the (now
> ML) objective function, when we already have the optimal values in big at theta
> and small at theta? Starting from the REML criterion, we would only need to
> take out the integrated fixed effects, and then we end up with an ML
> likelihood, correct? But then, why can't I simply do:
>
> > big.LLML <- update(big,REML=F,devFunOnly=T)(big at theta) #compute ML
> deviance based on optimal theta
> > small.LLML <- update(small,REML=F,devFunOnly=T)(small at theta)
> > small.LLML - big.LLML # same difference, without needing to re-fit!
> [1] 116.4677
>
> i.e. take the already-found optimal θ values and plug them into the ML
> formula? Why do we need to search for the optimal θ values again -- even if
> the values found by ML differ from those found by REML, shouldn't the REML
> values be preferred?
>
> Thank you very much for any insights,
> Cesko
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

	[[alternative HTML version deleted]]