[R-sig-ME] glmer optimization questions
Ben Bolker
bbolker at gmail.com
Tue Sep 17 23:36:13 CEST 2013
On 13-09-17 04:25 PM, Tobias Heed wrote:
> Ben,
>
> thanks for the reply. So for now (till those tools are available), by
> 'assess convergence', do you mean just checking whether the results
> look meaningful and like what I expect from plots?
>
> For convergence, I have a strange result: With Nelder-Mead, my model
> converges for some factor orders (I mean, the order I put them in the
> function call), but not with others. This seems to be reproducible
> (with the given dataset). So, say, my model converges for response ~
> A * B * C + random, but not for B * A * C + random. The model
> converges with all orders using bobyqa. I found another report like
> this (order effect) in a post somewhere, but it didn't seem to have
> been solved. Order really shouldn't matter, should it? Could this be
> due to starting values for optimization or something like that?
  That is strange.  Can you send data?

  A quick test of convergence should be *something* like

library(lme4)
library(numDeriv)
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
## deviance as a function of the covariance ('theta') parameters
dd <- update(fm1, devFunOnly = TRUE)
## numerical Hessian of the deviance at the fitted theta values
hh <- hessian(dd, getME(fm1, "theta"))
evd <- eigen(hh, symmetric = TRUE, only.values = TRUE)$values
## the Hessian should be positive definite, i.e. all eigenvalues > 0
evd

  See https://github.com/lme4/lme4/issues/120 for more detailed code
from Rune Christensen that implements a series of convergence checks.
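On the order effect specifically: the two term orders specify the same model (the columns of the fixed-effects model matrix are merely permuted), so a fully converged fit should reach the same deviance either way. A sketch with simulated data (the variables A, B, C, grp are placeholders standing in for Tobias's design, not his actual data):

```r
library(lme4)
set.seed(101)
## simulated stand-in for a 2x2x2 design with a grouping factor
dat <- expand.grid(A = factor(1:2), B = factor(1:2), C = factor(1:2),
                   grp = factor(1:20), rep = 1:5)
dat$resp <- rbinom(nrow(dat), 1, 0.5)
## same model, two term orders
f1 <- glmer(resp ~ A * B * C + (1 | grp), data = dat, family = binomial)
f2 <- glmer(resp ~ B * A * C + (1 | grp), data = dat, family = binomial)
## if both optimizations converged, this difference should be near 0
deviance(f1) - deviance(f2)
```

A deviance gap much larger than numerical noise would suggest that one of the two fits stopped short of the optimum.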
>
> Tobias
>
> --
> --------------------------------------------------------------
> Tobias Heed, PhD
> Biological Psychology and Neuropsychology | University of Hamburg
> Von-Melle-Park 11, Room 206 | D-20146 Hamburg, Germany
> Phone: (49) 40 - 42838 5831 | Fax: (49) 40 - 42838 6591
> tobias.heed at uni-hamburg.de | Website | Google Scholar | ResearcherID
> --------------------------------------------------------------
>
> On 17.09.2013, at 21:27, Ben Bolker <bbolker at gmail.com> wrote:
>
>> Tobias Heed <tobias.heed at ...> writes:
>>
>>>
>>> Hello,
>>>
>>> I am trying to understand the different options for fitting with
>>> glmer. I have been unable to find an overview over which options
>>> are appropriate in which cases. If there is a document out there
>>> that explains these things, I'd be thankful for a link.
>>
>> No (want to write one?)
>>
>>>
>>> My specific questions are:
>>
>>> 1, what is the difference in using maxIter in the function call
>>> vs. using maxfun in glmerControl()? Which one is better or more
>>> important to change when a model doesn't converge (i.e., what
>>> kind of iteration do they stand for)? Maxiter seems not to be
>>> documented in the help of lme4 1.1.0, does this mean it should
>>> not be used anymore?
>>
>> maxIter is old/obsolete. maxfun sets the maximum number of function
>> evaluations in the BOBYQA/Nelder-Mead phase (i.e., optimization over
>> the 'theta' parameter vector, the Cholesky factors of the
>> random-effects variance-covariance matrices)
>>
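For the record, in current lme4 (1.1.x) maxfun is passed through the optCtrl list of glmerControl()/lmerControl(); a sketch using the cbpp data that ship with lme4:

```r
library(lme4)
## raise the evaluation limit for the nonlinear optimization phase
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial,
             control = glmerControl(optCtrl = list(maxfun = 1e5)))
```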
>>> 2, I have a model that does not converge with Nelder-Mead, but
>>> does converge with bobyqa -- from googling around, it seems that
>>> some people like one or the other better, but are there specific
>>> things I should look out for when using the one or the other? Or,
>>> are there specific cases in which using one or the other would be
>>> more recommendable?
>>
>> We don't know enough about this (yet) to make strong
>> recommendations
>>
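Absent strong recommendations, one pragmatic check is to fit with both optimizers and compare: if both report convergence, the deviance and theta estimates should agree to within numerical tolerance. A sketch on the sleepstudy data:

```r
library(lme4)
## same model, two optimizers
fm_nm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy,
              control = lmerControl(optimizer = "Nelder_Mead"))
fm_bo <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy,
              control = lmerControl(optimizer = "bobyqa"))
## both differences should be near 0 if both reached the same optimum
deviance(fm_nm) - deviance(fm_bo)
max(abs(getME(fm_nm, "theta") - getME(fm_bo, "theta")))
```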
>>> 3, what kind of result or warning message would indicate that I
>>> should use the restart_edge option?
>>
>> If you get parameters on the boundary (i.e. 0 variances, +/-1
>> correlations) it may be worth trying. However, I'm not sure it's
>> actually implemented for glmer!
>>
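A quick way to see whether a fit has landed on the boundary is to compare theta against its lower bounds (0 for the diagonal Cholesky elements, -Inf for the off-diagonals); a sketch (the 1e-4 tolerance is an arbitrary choice):

```r
library(lme4)
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
th <- getME(fm1, "theta")
lo <- getME(fm1, "lower")
## TRUE would mean some bounded theta element is (numerically) at 0,
## i.e. a zero variance or a perfect correlation
any(th[lo == 0] < 1e-4)
```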
>>> 4, I got this warning: 2: In commonArgs(par, fn, control,
>>> environment()) : maxfun < 10 * length(par)^2 is not recommended.
>>> par appears to be the vector with parameters passed to the
>>> optimizer. Is it necessary (or just "better", but not imperative)
>>> to set maxfun to the value indicated in this equation, or higher?
>>> Why is a higher value for maxfun not used automatically when
>>> appropriate - does it have any negative consequences? Can I read
>>> out par easily somewhere?
>>
>> I believe this is coming from BOBYQA, but I'm not sure.
>>
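For what it's worth, the threshold in the warning is easy to compute by hand: par is the vector handed to the optimizer (theta plus the fixed-effect coefficients in glmer's final phase, theta alone when nAGQ = 0). A base-R sketch of the arithmetic (the parameter counts are made up):

```r
## hypothetical counts: 6 theta (covariance) parameters
## plus 8 fixed-effect coefficients
par_length <- 6 + 8
## the warning fires when maxfun is below 10 * length(par)^2
recommended_maxfun <- 10 * par_length^2
recommended_maxfun  ## 1960
```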
>>
>>> 5, when a model converges only after tinkering with any of the
>>> options (e.g., optimizer, maxfun, restart_edge) or maxiter, does
>>> this say anything about the quality or reliability of the fit?
>>
>> I would certainly be more careful to assess convergence in these
>> cases. Do the answers look sensible? (We hope to add some more
>> functionality for checking convergence ...)
>>
>>> 6, when reporting a GLMM, should these kinds of options be
>>> reported? It doesn't seem that people do, but it would seem
>>> appropriate when they are necessary to achieve convergence etc.,
>>> wouldn't it?
>>
>> Absolutely. You should always report *everything* necessary for
>> someone to reproduce your results (in an appendix or online
>> supplement, if necessary).
>>
>> cheers Ben Bolker
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>