[R-sig-ME] valid estimates using lme4?

Fri Oct 28 17:43:15 CEST 2011

On Fri, Oct 28, 2011 at 9:04 AM, Vernooij, J.C.M. (Hans)
<J.C.M.Vernooij at uu.nl> wrote:
> Dear list members,
>
> For a draft article we used the lme4 package for a logistic regression. A reviewer doubts the validity of the results:
> "I strongly urge you to compare the outcomes of lme4 in R with a validated statistical package (SAS, STATA, SPSS) as lme4 is known not to be the best, especially when the Laplace approximation is being used as the default is only one (!) integration point". (quoted)

Rather strongly worded, I would say.  I wouldn't suggest arguing with a
referee, but I wonder in what sense he/she believes that SAS, STATA and
SPSS are "validated".  If the referee believes that the vendors
guarantee correct answers, he/she hasn't read the software licenses.

The issue here is the method of evaluating an approximation to the
likelihood in a GLMM.  I have been involved in such discussions for
over 20 years, although initially for nonlinear mixed-effects models
rather than GLMMs.  When the reviewer bemoans the use of one
integration point, he/she is not taking into account the fact that the
approximation is evaluated at the conditional mode of the random
effects.  That is, every evaluation of the Laplace approximation (and
of the adaptive Gauss-Hermite quadrature, when it is done
properly) itself involves an optimization, using penalized iteratively
reweighted least squares, to determine the conditional mode of the
random effects.  Once that is determined, the conditional distribution
of each random effect is approximated as a Gaussian distribution
centered at the mode, with the standard deviation taken from the
second-order approximation to the log-density at the mode.
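
To make the preceding description concrete, here is a minimal Python
sketch for a single cluster of a random-intercept logistic model.  It
is not the code in lme4: the data, the value of sigma and all function
names are invented for illustration, and plain Newton iterations on a
scalar random effect stand in for penalized iteratively reweighted
least squares.  It finds the conditional mode, forms the Laplace
approximation there, and compares it with a brute-force trapezoid
integral.

```python
import math

# Hypothetical data for ONE cluster: binary responses and fixed-effect
# linear predictors (all values invented for illustration).
y = [1, 1, 1, 0, 1, 1]
eta = [-0.5, 0.0, 0.3, -0.2, 0.4, 0.1]
sigma = 1.0  # assumed standard deviation of the random intercept


def log_joint(u):
    """log f(y | u) + log f(u) for a scalar random intercept u."""
    value = -0.5 * math.log(2 * math.pi * sigma**2) - u**2 / (2 * sigma**2)
    for yj, ej in zip(y, eta):
        lin = ej + u
        value += yj * lin - math.log1p(math.exp(lin))  # Bernoulli, logit link
    return value


def gradient_and_hessian(u):
    """First and second derivatives of log_joint with respect to u."""
    mu = [1.0 / (1.0 + math.exp(-(ej + u))) for ej in eta]
    grad = sum(yj - mj for yj, mj in zip(y, mu)) - u / sigma**2
    hess = -sum(mj * (1.0 - mj) for mj in mu) - 1.0 / sigma**2
    return grad, hess


def conditional_mode(tol=1e-12, max_iter=50):
    """Newton iterations for the mode; a scalar stand-in for PIRLS."""
    u = 0.0
    for _ in range(max_iter):
        grad, hess = gradient_and_hessian(u)
        step = grad / hess
        u -= step
        if abs(step) < tol:
            break
    return u


u_hat = conditional_mode()
_, hess_hat = gradient_and_hessian(u_hat)

# Laplace: replace the integrand by a Gaussian centered at the mode whose
# variance is -1 / (second derivative of log_joint at the mode).
laplace = (log_joint(u_hat) + 0.5 * math.log(2 * math.pi)
           - 0.5 * math.log(-hess_hat))

# Brute-force reference: trapezoid rule on a fine grid over [-10, 10].
n, lo, hi = 20000, -10.0, 10.0
h = (hi - lo) / n
total = sum(math.exp(log_joint(lo + i * h)) for i in range(1, n))
total += 0.5 * (math.exp(log_joint(lo)) + math.exp(log_joint(hi)))
reference = math.log(total * h)

print(f"conditional mode      : {u_hat:.6f}")
print(f"Laplace log-likelihood: {laplace:.6f}")
print(f"reference (trapezoid) : {reference:.6f}")
```

The single "integration point" is thus placed at the optimum of the
integrand rather than at an arbitrary location, which is why counting
points alone understates the quality of the approximation.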

SAS provides for the approximation to use additional evaluations at
the Gauss-Hermite quadrature points.  Interestingly, they cite a paper
that Jose Pinheiro and I wrote as the reference on which they based
this method.  This will certainly provide a better approximation, but
exactly how much better is not clear.  I don't know of papers that
compare the approximations of the integral itself to see how much
is gained.
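
As a rough way to see what the extra quadrature points buy, the
following Python sketch applies adaptive Gauss-Hermite quadrature with
1, 3 and 5 points to a single cluster of a random-intercept logistic
model.  Again this is only an illustration under invented assumptions
(made-up data and sigma, hypothetical function names, hard-coded
physicists' Gauss-Hermite rules), not the method as implemented in SAS
or lme4.  The nodes are shifted to the conditional mode and scaled by
the curvature there, so the 1-point rule reproduces the Laplace
approximation.

```python
import math

# Hypothetical single-cluster data (invented for illustration).
y = [1, 1, 1, 0, 1, 1]
eta = [-0.5, 0.0, 0.3, -0.2, 0.4, 0.1]
sigma = 1.0  # assumed standard deviation of the random intercept

# Physicists' Gauss-Hermite rules (weight function exp(-x^2)).
SQRT_PI = math.sqrt(math.pi)
X3 = math.sqrt(1.5)
GH_RULES = {
    1: ([0.0], [SQRT_PI]),
    3: ([-X3, 0.0, X3], [SQRT_PI / 6, 2 * SQRT_PI / 3, SQRT_PI / 6]),
    5: ([-2.0201828704560856, -0.9585724646138185, 0.0,
         0.9585724646138185, 2.0201828704560856],
        [0.019953242059045913, 0.3936193231522412, 0.9453087204829419,
         0.3936193231522412, 0.019953242059045913]),
}


def log_joint(u):
    """log f(y | u) + log f(u) for a scalar random intercept u."""
    value = -0.5 * math.log(2 * math.pi * sigma**2) - u**2 / (2 * sigma**2)
    for yj, ej in zip(y, eta):
        value += yj * (ej + u) - math.log1p(math.exp(ej + u))
    return value


def mode_and_scale(tol=1e-12, max_iter=50):
    """Newton iterations for the conditional mode; returns (mode, sd)."""
    u = 0.0
    for _ in range(max_iter):
        mu = [1.0 / (1.0 + math.exp(-(ej + u))) for ej in eta]
        grad = sum(yj - mj for yj, mj in zip(y, mu)) - u / sigma**2
        hess = -sum(mj * (1.0 - mj) for mj in mu) - 1.0 / sigma**2
        step = grad / hess
        u -= step
        if abs(step) < tol:
            break
    # Recompute the curvature at the final u for the Gaussian scale.
    mu = [1.0 / (1.0 + math.exp(-(ej + u))) for ej in eta]
    hess = -sum(mj * (1.0 - mj) for mj in mu) - 1.0 / sigma**2
    return u, 1.0 / math.sqrt(-hess)


def aghq_loglik(n_points, u_hat, scale):
    """Adaptive Gauss-Hermite approximation to log of the integral."""
    nodes, weights = GH_RULES[n_points]
    total = 0.0
    for x, w in zip(nodes, weights):
        u = u_hat + math.sqrt(2.0) * scale * x  # shift and scale the node
        total += w * math.exp(x * x + log_joint(u))
    return math.log(math.sqrt(2.0) * scale * total)


u_hat, scale = mode_and_scale()
aghq = {k: aghq_loglik(k, u_hat, scale) for k in (1, 3, 5)}
laplace = log_joint(u_hat) + 0.5 * math.log(2 * math.pi) + math.log(scale)

# Brute-force reference: trapezoid rule on a fine grid over [-10, 10].
n, lo, hi = 20000, -10.0, 10.0
h = (hi - lo) / n
total = sum(math.exp(log_joint(lo + i * h)) for i in range(1, n))
total += 0.5 * (math.exp(log_joint(lo)) + math.exp(log_joint(hi)))
reference = math.log(total * h)

for k in (1, 3, 5):
    print(f"{k} point(s): error vs reference = {aghq[k] - reference:+.2e}")
```

Running this on one's own cluster sizes and variance estimates gives a
quick impression of how far the Laplace value is from the integral it
approximates, which is the comparison the referee's remark is really
about.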

It may seem that this issue could be put to rest by incorporating an
adaptive Gauss-Hermite method in glmer and, as Dimitris points out,
there has been such a method in versions of glmer, but only for very
specific models. We will add it but right now we are concentrating on
other issues in the development.  We should point out that Laplace
versus adaptive Gauss-Hermite concerns only the approximation of the
log-likelihood.  In some ways I think that reliable optimization of
the approximate log-likelihood is more important, and that is an area
where R is not strong.  Far too much optimization code is covered by
licenses that are not compatible with R's license, and the pickings for
Open Source optimization code are somewhat slim.  I wish we had access
to some of the optimizers that SAS uses, but we don't, so we make use of
what we do have.

> How should we respond to this? In http://glmm.wikidot.com/faq the Laplace approximation is said to be less accurate than Gauss-Hermite quadrature or MCMC methods, but is the difference in estimates such that the results are not valid? Should we validate the results by running different packages? Undoubtedly we will find differences, so which results should we report?
> What answer might convince the reviewer?
>
> Thanks,
> Hans
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>