[R-sig-ME] lmer: ML and REML estimation

Wed Mar 25 23:21:57 CET 2009

Douglas Bates wrote:

> I would claim that maximum likelihood estimates are well-defined for
> generalized linear mixed models but REML estimates are not. (It is
> true that Mary Lindstrom and I did offer a definition of REML
> estimates for nonlinear mixed-effects models but I consider that a
> youthful indiscretion and I didn't inhale. :-)
> 
> The bottom line is that REML only makes sense for linear mixed-effects models.

  Thank you!  Good to see that clarified.  Looks like we got it
wrong, or were at least misleading, in our recent TREE paper -- oh well,
science marches on.

  Might anyone here be able to point to a citation (other than "D.
Bates, r-sig-mixed-models mailing list, 25 March 2009") that would
support this statement ... ?

  For what it's worth (risking the wrath of the GODS), PROC NLMIXED
doesn't try to do REML, but claims it's a computational issue rather
than one of definition:

"With PROC MIXED you can perform both maximum likelihood and restricted
maximum likelihood (REML) estimation, whereas PROC NLMIXED only
implements maximum likelihood. This is because the analog to the REML
method in PROC NLMIXED would involve a high dimensional integral over
all of the fixed-effects parameters, and this integral is typically not
available in closed form."
<http://www.sfu.ca/sasdoc/sashtml/stat/chap46/sect4.htm>
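
For reference, here is a rough sketch of the integral the SAS
documentation appears to be describing. The notation is an assumption
for illustration and is not taken from either source: y is the response,
X the fixed-effects design matrix, beta the fixed effects, b the random
effects, theta the variance parameters, and V(theta) the marginal
covariance of y. In the linear case the integral over beta has a closed
form; in the generalized or nonlinear case it generally does not.

```latex
% Linear mixed model, y ~ N(X\beta, V(\theta)): integrating the
% likelihood over \beta gives the REML criterion in closed form.
L_R(\theta) = \int L(\beta, \theta \mid y)\, d\beta
  \;\propto\; |V(\theta)|^{-1/2}\, |X^\top V(\theta)^{-1} X|^{-1/2}
  \exp\!\left\{ -\tfrac{1}{2}
    (y - X\hat\beta_\theta)^\top V(\theta)^{-1} (y - X\hat\beta_\theta)
  \right\},
\qquad
\hat\beta_\theta = (X^\top V(\theta)^{-1} X)^{-1} X^\top V(\theta)^{-1} y

% Generalized / nonlinear analogue: the same idea would require
% integrating over both b and \beta, with no closed form in general.
\int\!\!\int f(y \mid \beta, b)\, f(b \mid \theta)\, db\, d\beta
```

Whether the second expression deserves to be called "REML" at all is
exactly the definitional question raised above.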

On the other hand, GLIMMIX lets you go ahead and hang yourself:

"Additionally, GLIMMIX allows the use of restricted maximum likelihood
(REML) methods, which have been shown to produce better estimates than
full maximum likelihood (ML) when the number of higher-level units is
small. REML is not available in NLMIXED."
<www.nesug.org/proceedings/nesug06/an/da08.pdf>

This is shortly after stating that

"GLIMMIX, in contrast, can produce potentially biased estimates for both
fixed effects and covariance parameters, especially for binary data
(Schabenberger 2005)."  (!! see also Breslow 2003)

  Does anyone out there have a suggestion/defense for when it *is*
acceptable to use PQL/MQL to fit binary GLMMs?

  A further question: do you think it will generally be true that ML
estimates of random effects variances will be slightly biased downwards
because we don't have an analogue of REML?  (A wild guess, but I would
think that the mean of the posterior distribution of the variance
estimate (with uninformative priors) would be unbiased since it averages
across the variation in the estimate of the fixed effects????)
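
To make the intuition behind the downward-bias question concrete, here
is a minimal simulation sketch of the simplest linear analogue: an
intercept-only Gaussian model with no random effects at all. It is not a
GLMM and says nothing directly about lmer; the function name, sample
size, true variance, and seed are all made up for illustration. In this
case the ML variance estimate divides the residual sum of squares by n,
while the REML estimate divides by n - p (here p = 1).

```python
import random


def simulate_variance_estimates(n=5, true_var=4.0, n_sims=20000, seed=1):
    """Average ML and REML variance estimates over many simulated data sets.

    Model (hypothetical, for illustration): y_i ~ N(mu, true_var),
    with a single fixed effect (the mean), so p = 1.
    """
    rng = random.Random(seed)
    sd = true_var ** 0.5
    ml_total = 0.0
    reml_total = 0.0
    for _ in range(n_sims):
        y = [rng.gauss(10.0, sd) for _ in range(n)]
        ybar = sum(y) / n  # estimate of the one fixed effect
        rss = sum((yi - ybar) ** 2 for yi in y)
        ml_total += rss / n          # ML: ignores the d.f. used by ybar
        reml_total += rss / (n - 1)  # REML: divides by n - p
    return ml_total / n_sims, reml_total / n_sims


ml_mean, reml_mean = simulate_variance_estimates()
print("true variance: 4.0")
print("mean ML estimate:   %.3f" % ml_mean)
print("mean REML estimate: %.3f" % reml_mean)
```

With n = 5 the ML estimate should average close to (n - 1)/n times the
true variance, and the REML estimate close to the true variance itself.
How far this carries over to random-effects variances in GLMMs is the
open question.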

  cheers
    Ben Bolker

-- 
Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bolker at ufl.edu / www.zoology.ufl.edu/bolker
GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc