# [R-sig-ME] lmer: ML and REML estimation

Douglas Bates bates at stat.wisc.edu
Sat Mar 28 15:49:36 CET 2009

I managed to edit out an important word in one sentence below.

On Sat, Mar 28, 2009 at 9:45 AM, Douglas Bates <bates at stat.wisc.edu> wrote:
> On Fri, Mar 27, 2009 at 1:01 PM, dave fournier <otter at otter-rsch.com> wrote:
>>  Maybe I'm missing something here. The way I see it,
>> in its most general form a mixed model is simply one
>> where some of the parameters are random variables with a prior.
>
> That's a rather general class of models.  Some might use the name
> "Empirical Bayes".
>
> I read the phrase "some of the parameters are random variables" to be
> referring to the random effects.  I phrase things slightly
> differently.  In particular I don't regard the random effects as
> parameters. I regard a mixed-effects model as being based on two
> random variables: the response Y whose value, y, has been observed and
> an unobserved random effects vector B.  The probability model
> describes the conditional distribution of Y, given B = b, and the
> unconditional distribution of B.  For the models that can be fit by
> the lme4 package the unconditional distribution of B is a multivariate
> Gaussian with mean zero and a parameterized variance-covariance
> matrix.  The conditional mean of Y, given B = b, depends on b through
> a linear predictor expression, X\beta + Zb, that also incorporates the
> fixed-effects parameters \beta.
>
> The random effects B are similar to the fixed-effects \beta in that
> they both occur as coefficients in the linear predictor but I find
> that, for me at least, it is important to retain a distinction between
> the random effects and the model parameters.
>
>> So say the likelihood looks like
>>
>>   f(x,u)p(u)
>>
>> where the vector of parameters u have a prior p(u).
>> If the u are normally distributed with a vector of
>> std devs sigma this becomes
>>
>>    f(x,u)p(u,sigma)
>>
>> Integrating over u gives
>>
>>  F(x,sigma) = \Int F(x,u)p(u,sigma) du
>>
>> so the MLE's xhat,sigamhat are found by maximizing
>> F wrt x and sigma.
>
> Well, that was kind of the point of the discussion.  For a linear
> mixed model there are many equivalent ways to arrive at the REML
> criterion.  One way that provides a convenient computational approach
> is to maximize, with respect to the random-effects dispersion
> parameters, the integral of the likelihood over the fixed-effects
> parameters.  The original motivation for the criterion is as the
> maximum likelihood estimator of the variance parameters based on a set
> of contrasts that are orthogonal to the fixed-effects model space.
> That is, these are the maximum likelihood estimators from a set of
> linear combinations of data values that are to a certain subspace, the

that should read "... that are restricted to a certain ..."

> subspace of all possible residuals.  This is why they are called the
> restricted or residual maximum likelihood estimators.
>
> I first saw the relationship between maximizing the integrated
> likelihood and the REML estimators in the 1982 Biometrics paper
> "Random Effects Models for Longitudinal Data" by Laird and Ware but it
> may have been known before that.
>
> It is certainly possible to define estimators that maximize the
> integrated likelihood in more general mixed-effects model
> specifications but should we call these REML estimators and do we know
> that they will share the properties that make REML a popular
> estimation criterion for linear mixed models?
>
> I oversimplified when I said that REML is not defined for GLMMs.  I
> was using REML to refer to the estimators based on orthogonal
> subspaces in the sample space.
>
>> Now if we are willing to take a Bayesian point of view
>
> I thought you were already there when you wrote about "parameters are
> random variables with a prior" :-)
>
>> we could also integrate over x and get the marginal
>> likelihood for the sigma alone. We may need a prior on x,
>> or assume a locally uniform prior. In any event we obtain
>>
>>  G(sigma) = \Int F(x,u)p(u,sigma) du dx
>>
>> The integral can be approximated using the Laplace approximation
>> at the mode of  F(x,u)p(u,sigma).  Now I'm not claiming that this
>> approach always works well, but it is a simple recipe for producing
>> an estimation procedure which may give better estimates for sigma.
>> Using AD Model Builders Random effects module which is freely available
>> at http://admb-project.org it is simple to
>> change the model specification to carry out this procedure. As I said
>> before it seemed to work well in a nonlinear fisheries model I looked
>> at.
>>
>> --
>> David A. Fournier
>> P.O. Box 2040,
>> Sidney, B.C. V8l 3S3
>> Phone/FAX 250-655-3364
>> http://otter-rsch.com
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>