[R-sig-ME] valid estimates using lme4?

Doran, Harold HDoran at air.org
Fri Oct 28 20:25:32 CEST 2011


The paragraph below should also have included the sentence that follows it.

> It is impossible to determine if SAS, Stata, or SPSS are implementing the
> steps they claim to implement since the source code is not available. It is
> one thing to be able to write out the algebraic expression for solving mixed
> models, whether using Henderson's mixed model equations (SAS) or any other
> approach.

It's another thing to know with certainty whether the steps they claim are properly implemented in code.
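
As a rough illustration of the simulation check listed as step (3) in the quoted message below, one can simulate data from a model with known parameters, fit it, and confirm that the estimates land near the truth. The sketch below uses Python with NumPy and a balanced one-way random-intercept model fitted by ANOVA moment estimators. That estimator is a stand-in chosen for brevity, not the algorithm used by lme4 or by any commercial package, and every function name and parameter value here is made up for the example.

```python
import numpy as np

def simulate_one_way(m, n, mu, var_b, var_e, rng):
    """Simulate y_ij = mu + b_i + e_ij for m groups of n observations each."""
    b = rng.normal(0.0, np.sqrt(var_b), size=(m, 1))   # random intercepts
    e = rng.normal(0.0, np.sqrt(var_e), size=(m, n))   # residual noise
    return mu + b + e

def anova_estimates(y):
    """Moment estimators for a balanced one-way random-effects model."""
    m, n = y.shape
    group_means = y.mean(axis=1)
    grand_mean = y.mean()
    # Within-group mean square estimates var_e.
    msw = np.sum((y - group_means[:, None]) ** 2) / (m * (n - 1))
    # Between-group mean square estimates var_e + n * var_b.
    msb = n * np.sum((group_means - grand_mean) ** 2) / (m - 1)
    return grand_mean, (msb - msw) / n, msw

# Assumed "true" values, chosen only for this illustration.
TRUE_MU, TRUE_VAR_B, TRUE_VAR_E = 10.0, 0.5, 1.0

rng = np.random.default_rng(2011)  # fixed seed so the check is repeatable
y = simulate_one_way(2000, 5, TRUE_MU, TRUE_VAR_B, TRUE_VAR_E, rng)
mu_hat, var_b_hat, var_e_hat = anova_estimates(y)

print("mu:", mu_hat, "var_b:", var_b_hat, "var_e:", var_e_hat)
```

A check of this kind only shows that an implementation recovers known parameters under the simulated conditions. It does not replace reading the source against the mathematical model.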

> -----Original Message-----
> From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-
> bounces at r-project.org] On Behalf Of Doran, Harold
> Sent: Friday, October 28, 2011 2:17 PM
> To: Douglas Bates; Vernooij, J.C.M. (Hans)
> Cc: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] valid estimates using lme4?
> 
> It is impossible to determine if SAS, Stata, or SPSS are implementing the
> steps they claim to implement since the source code is not available. It is
> one thing to be able to write out the algebraic expression for solving mixed
> models, whether using Henderson's mixed model equations (SAS) or any other
> approach.
> 
> Part of unit testing software does involve simulation and testing to ensure
> one recovers back the anticipated parameters. However, "validation" is NOT
> comparing output from one program to another.
> 
> Differences in various rule implementations (such as when to stop) can alter
> parameter estimates between programs.
> 
> I would propose that validation requires the ability to review:
> 
> 1) The mathematical model proposed to implement the mixed model solution
> 2) A review of source code to ensure that code aligns with the mathematical
> model
> 3) Unit testing with some simulation
> 
> Since step (2) is impossible for SAS, Stata, and SPSS, how can they be
> validated? The source code and mathematical model are available for lme4
> functions.
> 
> I assume the reviewer assumes they are valid because they are sold. In which
> case, I'm sure Doug Bates would be happy to collect a donation to bring him up
> to "validation" standards if that is what is required.
> 
> 
> 
> 
> > -----Original Message-----
> > From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-
> > bounces at r-project.org] On Behalf Of Douglas Bates
> > Sent: Friday, October 28, 2011 11:43 AM
> > To: Vernooij, J.C.M. (Hans)
> > Cc: r-sig-mixed-models at r-project.org
> > Subject: Re: [R-sig-ME] valid estimates using lme4?
> >
> > On Fri, Oct 28, 2011 at 9:04 AM, Vernooij, J.C.M. (Hans)
> > <J.C.M.Vernooij at uu.nl> wrote:
> > > Dear list members,
> > >
> > > For a concept article we used the lme4 package for a logistic regression. A
> > > reviewer doubts the validity of the outcomes:
> > > "I strongly urge you to compare the outcomes of lme4 in R with a validated
> > > statistical package (SAS, STATA, SPSS) as lme4 is known not to be the best,
> > > especially when the Laplace approximation is being used as the default is
> > > only one (!) integration point". (quoted)
> >
> > Rather strongly worded I would say.  I wouldn't suggest arguing with a
> > referee but I wonder in what sense he/she believes that SAS, STATA and
> > SPSS are "validated".  If the referee believes that the vendors
> > guarantee correct answers he/she hasn't read the software licenses.
> >
> > The issue here is the method of evaluating an approximation to the
> > likelihood in a GLMM.  I have been involved in such discussions for
> > over 20 years, although initially for nonlinear mixed-effects models
> > rather than GLMMs.  When the reviewer is bemoaning the use of one
> > integration point they are not taking into account the fact that the
> > approximation is being evaluated at the conditional mode of the random
> > effects.  That is, every evaluation of the Laplace approximation (and
> > the adaptive Gauss-Hermite quadrature evaluation, when it is done
> > properly) itself involves an optimization, using penalized iteratively
> > reweighted least squares, to determine the conditional mode of the
> > random effects.  Once that is determined the conditional distribution
> > of each random effect is approximated as a Gaussian distribution
> > centered at the mode and with the standard deviation matching the
> > second-order approximation.
> >
> > SAS provides for the approximation to use additional evaluations at
> > the Gauss-Hermite quadrature points.  Interestingly they cite a paper
> > that Jose Pinheiro and I wrote as the reference on which they based
> > this method.  This will certainly provide a better approximation but
> > exactly how much better is not clear.  I don't know of papers in which
> > the approximation of the integral itself was compared to see how much
> > is gained.
> >
> > It may seem that this issue could be put to rest by incorporating an
> > adaptive Gauss-Hermite method in glmer and, as Dimitris points out,
> > there has been such a method in versions of glmer but only for very
> > specific models. We will add it but right now we are concentrating on
> > other issues in the development.  We should point out that Laplace
> > versus adaptive Gauss-Hermite is related to the approximation of the
> > log-likelihood.  In some ways I think that reliable optimization of
> > the approximate log-likelihood is more important and that is an area
> > where R is not strong.  Far too much optimization code is covered by
> > licenses that are not compatible with R's license and the pickings for
> > Open Source optimization code are somewhat slim.  I wish I had access
> > to some of the optimizers that SAS uses but we don't so we make use of
> > what we do have.
> >
> >
> > > How to respond to this? In http://glmm.wikidot.com/faq the Laplace
> > > estimation is said to be less accurate than Gauss-Hermite quadrature or
> > > MCMC methods, but is the difference in estimates such that the results
> > > are not valid? Should we validate the results by running different
> > > packages? Undoubtedly we will find differences, so which results should
> > > we report?
> > > What answer might convince the reviewer?
> > >
> > > Thanks,
> > > Hans
> > >
> > > _______________________________________________
> > > R-sig-mixed-models at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > >

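The point in Douglas Bates's quoted reply, that the Laplace approximation is evaluated at the conditional mode of the random effects, can be checked numerically for a single cluster. The sketch below (Python with NumPy) assumes a hypothetical cluster with 9 successes in 10 Bernoulli trials, a fixed intercept of 0, and a random-intercept standard deviation of 2. It finds the conditional mode with a scalar Newton iteration, used here as a simple stand-in for penalized iteratively reweighted least squares, and then compares the Laplace approximation with adaptive Gauss-Hermite quadrature centered and scaled at that mode. None of this is lme4 or SAS code, and all names and values are illustrative.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

# Hypothetical single cluster: k successes in n Bernoulli trials with
# logit(p) = beta + b and random intercept b ~ N(0, sigma^2).
K_SUCC, N_TRIALS, BETA, SIGMA = 9, 10, 0.0, 2.0

def log_integrand(b):
    """log of [conditional likelihood of the cluster] * [N(0, sigma^2) density]."""
    eta = BETA + b
    loglik = K_SUCC * eta - N_TRIALS * np.logaddexp(0.0, eta)
    logprior = -0.5 * (b / SIGMA) ** 2 - np.log(SIGMA * np.sqrt(2.0 * np.pi))
    return loglik + logprior

def curvature(b):
    """Second derivative of log_integrand with respect to b (always negative)."""
    p = 1.0 / (1.0 + np.exp(-(BETA + b)))
    return -N_TRIALS * p * (1.0 - p) - 1.0 / SIGMA**2

def conditional_mode(max_iter=100, tol=1e-12):
    """Newton iteration for the mode of log_integrand; returns (mode, scale)."""
    b = 0.0
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(BETA + b)))
        grad = K_SUCC - N_TRIALS * p - b / SIGMA**2
        step = grad / curvature(b)
        b -= step
        if abs(step) < tol:
            break
    # Scale of the Gaussian matching the second-order expansion at the mode.
    return b, 1.0 / np.sqrt(-curvature(b))

def laplace():
    """Laplace approximation to the integral of exp(log_integrand)."""
    b_hat, s = conditional_mode()
    return np.exp(log_integrand(b_hat)) * s * np.sqrt(2.0 * np.pi)

def adaptive_gauss_hermite(n_points):
    """Gauss-Hermite quadrature recentred at the mode and rescaled by s."""
    b_hat, s = conditional_mode()
    x, w = hermgauss(n_points)            # nodes/weights for weight exp(-x^2)
    b = b_hat + np.sqrt(2.0) * s * x
    return np.sqrt(2.0) * s * np.sum(w * np.exp(x**2 + log_integrand(b)))

# Brute-force reference value on a fine grid (the integrand is negligible
# beyond 15 prior standard deviations).
grid = np.linspace(-30.0, 30.0, 200001)
reference = (grid[1] - grid[0]) * np.sum(np.exp(log_integrand(grid)))

lap = laplace()
agq1 = adaptive_gauss_hermite(1)
agq9 = adaptive_gauss_hermite(9)
rel_err_laplace = abs(lap - reference) / reference
rel_err_agq9 = abs(agq9 - reference) / reference

print("Laplace:", lap, "AGQ(1):", agq1, "AGQ(9):", agq9, "reference:", reference)
print("relative errors:", rel_err_laplace, rel_err_agq9)
```

With a single node the adaptive rule reduces algebraically to the Laplace approximation, so "one integration point" here means one point placed at the conditional mode after an inner optimization, not one arbitrary evaluation. How much the extra nodes matter for the final parameter estimates depends on the data and model, and this toy integral does not settle that question.
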