# [R-sig-ME] lmer: ML and REML estimation

Douglas Bates bates at stat.wisc.edu
Wed Mar 25 14:47:04 CET 2009

```
On Tue, Mar 24, 2009 at 9:10 PM, Brant Inman <brant.inman at me.com> wrote:
> Experts:

> I am writing a paper using the results that I obtained this week with lmer
> (thanks Doug Bates).  I wanted to clarify one technical point about lmer to
> make sure that I understand its mechanics correctly when reporting my
> statistical methods in the paper.

> I used lmer to fit several multilevel logistic regression models.  The help
> page for lmer states that these binomial models are estimated with maximum
> likelihood (ML) methods.  My high-level reading on linear mixed effects
> models has suggested that REML estimates are better than ML estimates, so I
> wondered whether non-linear likelihoods, like those of the binomial models
> that I have used, can be estimated with REML methods or not.

I think of the REML criterion as an adjustment to the likelihood for
linear mixed-effects models for the purpose of producing the point
estimates of variance components that people think should be returned.
It is patterned after the adjustments for degrees of freedom in
estimating variances from a single sample and for a linear regression
model.  For example, we define the sample variance as the sum of
squared deviations from the sample mean divided by n - 1.  If we were
to create the maximum likelihood estimate of the variance (assuming
the sample is a realization of independent and identically distributed
Gaussian random variables) we would divide by n, not n - 1.  Similarly,
in a linear regression model we estimate the residual variance as the
residual sum of squares divided by n - p whereas the maximum
likelihood estimator has n in the denominator.

One way of thinking of this process is to divide the sample space into
two orthogonal linear subspaces where the sample mean or, more
generally, the coefficients of the linear regression model are
determined by the component in the p-dimensional predictor space and
the variance component is defined by the component in the
(n-p)-dimensional space that is orthogonal to the predictor space.
This argument can be carried over to linear mixed models but begins to
break down seriously if you try to carry it over to generalized linear
models or generalized linear mixed models.

I would claim that maximum likelihood estimates are well-defined for
generalized linear mixed models but REML estimates are not. (It is
true that Mary Lindstrom and I did offer a definition of REML
estimates for nonlinear mixed-effects models but I consider that a
youthful indiscretion and I didn't inhale. :-)

The bottom line is that REML only makes sense for linear mixed-effects models.

> If this
> question is obviously stupid, keep in mind that I am just a dumb surgeon,
> not a statistician.

```
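
As a small illustration of the degrees-of-freedom adjustment described in the message above, the following sketch compares the two denominators for a single sample (n versus n - 1) and for a simple linear regression (n versus n - p, with p = 2 for an intercept and a slope). It is not taken from the original message: the data values and the helper `residual_variances` are hypothetical, chosen only to make the arithmetic concrete, and it uses Python's standard library rather than R or lmer.

```python
import statistics

# Hypothetical data, chosen only for illustration.
sample = [4.1, 5.3, 3.8, 6.0, 5.2, 4.6]
n = len(sample)

# Maximum likelihood estimate of the variance: divide by n.
ml_var = statistics.pvariance(sample)
# Usual sample variance: divide by n - 1.
adjusted_var = statistics.variance(sample)

# The two estimates differ only by the factor (n - 1) / n.
assert abs(ml_var - adjusted_var * (n - 1) / n) < 1e-12


def residual_variances(x, y):
    """Fit y = a + b*x by least squares (hypothetical helper).

    Returns (RSS / n, RSS / (n - p)) with p = 2 coefficients.
    """
    m = len(x)
    p = 2
    xbar = sum(x) / m
    ybar = sum(y) / m
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return rss / m, rss / (m - p)


# Hypothetical predictor values paired with the sample above.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ml_resid_var, adjusted_resid_var = residual_variances(x, sample)

print("single sample: ML =", ml_var, " adjusted =", adjusted_var)
print("regression:    ML =", ml_resid_var, " adjusted =", adjusted_resid_var)
```

In both cases the maximum likelihood estimate is the smaller of the two, and the gap shrinks as n grows relative to the number of estimated coefficients. The REML criterion plays the analogous role for variance components in linear mixed-effects models.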