[R-sig-ME] Point estimate outside of its confidence interval with lmer
Ben Bolker
bbo|ker @end|ng |rom gm@||@com
Thu May 12 15:54:56 CEST 2022
For what it's worth, there is a (relatively ancient) Github issue
discussing this:
https://github.com/lme4/lme4/issues/325
Further evidence that we ought to add a warning, or at least a
message, saying that the profiles are based on the ML fit.
Can anyone point me to some theory that supports/describes the
process of constructing CIs based on the profile of the REML statistic ... ?
cheers
Ben Bolker
On 2022-05-12 7:01 a.m., Viechtbauer, Wolfgang (NP) wrote:
> Dear Emmanuel,
>
> Please see below for my responses.
>
> Best,
> Wolfgang
>
>> -----Original Message-----
>> From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces using r-project.org] On
>> Behalf Of Emmanuel Curis
>> Sent: Thursday, 12 May, 2022 9:30
>> To: r-sig-mixed-models using r-project.org
>> Subject: [R-sig-ME] Point estimate outside of its confidence interval with lmer
>>
>> Hello,
>>
>> I have encountered an unexpected result for some datasets when using confint
>> after fitting a model with lmer: the confidence intervals for the standard
>> deviations in the model did not included the point estimate, given by summary
>> for instance.
>>
>> I think the problem is a mix of small sample size and REML vs ML, but I would
>> be happy to have confirmation that my interpretation is correct, since I'm
>> not very familiar with profiling and REML vs ML... and wonder if something
>> more problematic is occuring.
>>
>> My interpretation is that lmer fits using REML by default,
>
> Correct.
>
>> hence the variances are estimated "unbiased" (the equivalent of dividing by n
>> - k for the residual variance of a linear model, but I'm not sure exactly
>> what is n and k for random effects variance estimation).
>
> Only for some very special cases can we show that the variance components are
> estimated unbiasedly. However, yes, REML estimation tends to provide estimates
> of variance components that are approximately unbiased in many cases.
>
> Also, for more complex models, changing a ML estimate into a REML one isn't
> simply a multiplication of the ML estimate by some factor that depends on
> things like the sample size (and in more complex models, number of clusters)
> and the number of parameters.
>
>> But when confint is called, using profile, the ML profile is used, hence
>> variance confidence intervals are estimated "biased" (the equivalent
>> of dividing by n). However, when n is small, dividing by n - k and n
>> may give very different results, hence in some cases the profiled
>> confidence interval for ML estimate does not include the REML
>> estimate, for the standard deviations. I guess in practice the
>> difference between REML and ML is more complex than just using n or
>> n-k, but would it be idea?
>
> Correct - see above. So I wouldn't just heuristically apply some 'correction'
> that only comes from simple regression models.
>
> Actually, when constructing a profile likelihood CI for a variance component
> (which is what confint() does by default for lmer model objects), one can also
> profile the restricted log likelihood function, which would guarantee that the
> estimate falls into the CI. I actually thought that this is what confint()
> does for lmer objects (fitted with REML), but I checked and you are correct -
> even when the model was fitted with REML, all profile likelihood CIs
> (including those for variance components) are based on the regular likelihood.
>
> Without a reproducible example, I cannot tell you whether this is really the
> reason for the estimate falling outside of the CI, but it certainly could be.
>
>> Support for this is that
>> 1) when using bootstrap, there seems to be no such discrepancy
>> 2) when fitting using REML = FALSE, the profiled IC is the same and
>> does include the (different) point estimate
>>
>> Does it sound correct?
>
> Not sure how 1) helps to diagnose this, but 2) certainly provides support for
> the hypothesis that the discrepancy arises from different likelihoods being
> used for estimation and profiling.
>
>> Additionnal questions, if this interpretation is correct:
>> - would it make sense to make confidence intervals based on REML
>> profiles, and not ML profiles? if so, how?
>
> As noted above, this can be done for variance components, but I don't see a
> way of doing this with confint() for lmer model objects.
>
>> - wouldn't a warning be a good idea when point estimates are outside
>> CI, with the explaination if it is indeed REML vs ML?
>
> That's up to the developers of lme4 to decide.
>
>> - if this is indeed a "small sample size" problem, I guess in such
>> cases any asymptotic result is difficult to trust, right? Does it
>> mean profiled interval cannot be trusted also, neither nested
>> models tests, and that only bootstrap may be used?
>
> Profiling the likelihood relies on the asymptotic behavior of the likelihood
> ratio test. So yes, profiling likelihood CIs may not have a nominal coverage
> rate in small samples (for some definition of 'small', which is context
> dependent).
>
>> - in such cases, is there any argument to prefer REML over ML or
>> vice-versa?
>
> One can make arguments in favor of both ML and REML. REML tends to provide
> approximately unbiased estimates, but with larger mean-squared error, while ML
> tends to provide negative biased estimates, but with smaller MSE.
>
>> Thanks in advance for your help,
>>
>> --
>> Emmanuel CURIS
>> emmanuel.curis using parisdescartes.fr
>>
>> Page WWW: http://emmanuel.curis.online.fr/index.html
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
More information about the R-sig-mixed-models
mailing list