[R-sig-ME] glmer optimization questions

Ben Bolker bbolker at gmail.com
Wed Sep 18 15:48:26 CEST 2013


On 13-09-18 05:44 AM, Tobias Heed wrote:
> Ben,
> 
> I was preparing the dataset to send to you, and re-ran those GLMMs.
> This time, I got no convergence on any of the different "permutations"
> of the formula.
> I then compared the estimates of the converged run (from yesterday) and
> the not-converged runs (from today), and they are very similar with only
> very small deviations (this is true for both fixed effect estimates and
> random effect correlations and variances).
> I then let the same model which did converge yesterday run 3 times
> today, and it never converged, but the estimation was always similar. I
> had it converge several times yesterday.
> The estimation is also similar to the bobyqa solution which
> (consistently) does converge…
> 
> So it appears not to be a problem of permuting the factors in the
> formula, but rather a failure to replicate convergence (or
> non-convergence) in different runs of the same model with Nelder-Mead.
> This would seem something that could happen depending on the starting
> values for estimation -- are they chosen randomly each time, or are they
> fixed?
> Also, it seems like the problem stems from the end of optimization
> (given that parameters are so close to those of converged models).
> 
> Let me know if you still want to look at the data (given that it seems
> harder to replicate than I thought yesterday, it looks like it might be
> cumbersome to find out what is going on). 
> 
> Best,
> Tobias

  Please do send the data.  There's not *supposed* to be any
non-deterministic component to the lme4 fitting procedures. We have had
problems in the past with internal components of the fitted object not
being reset exactly to their starting values, and I think there may
still be some small issues there, so any examples we can get are useful.
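  For what it's worth, one quick way to check for run-to-run variation is
simply to refit the identical model and compare the estimated parameters
(a sketch, using the built-in sleepstudy data as a stand-in for your model):

```r
## Sketch of a reproducibility check: fit the identical model twice
## and compare the estimated covariance parameters (theta).
## sleepstudy is a built-in lme4 example dataset.
library(lme4)
f1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
f2 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
## If the fitting procedure is fully deterministic, these should match
all.equal(getME(f1, "theta"), getME(f2, "theta"))
```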

  Ben Bolker


> On 17 Sep 2013, at 23:36, Ben Bolker <bbolker at gmail.com> wrote:
> 
>> On 13-09-17 04:25 PM, Tobias Heed wrote:
>>> Ben,
>>>
>>> thanks for the reply. So for now (till those tools are available), by
>>> 'assess convergence', do you mean just checking whether the results
>>> look meaningful and like what I expect from plots?
>>>
>>> For convergence, I have a strange result: With Nelder-Mead, my model
>>> converges for some factor orders (I mean, the order I put them in the
>>> function call), but not with others. This seems to be reproducible
>>> (with the given dataset). So, say, my model converges for response ~
>>> A * B * C + random, but not for B * A * C + random. The model
>>> converges with all orders using bobyqa. I found another report like
>>> this (order effect) in a post somewhere, but it didn't seem to have
>>> been solved. Order really shouldn't matter, should it? Could this be
>>> due to starting values for optimization or something like that?
>>
>>  That is strange.  Can you send data?
>>
>>  A quick test of convergence should be *something* like
>>
>> library(lme4)
>> fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
>> library(numDeriv)
>> dd <- update(fm1,devFunOnly=TRUE)
>> hh <- hessian(dd,getME(fm1,"theta"))
>> evd <- eigen(hh, symmetric=TRUE, only.values=TRUE)$values
>> ## all eigenvalues should be positive, i.e. the Hessian should be
>> ## positive definite at a true local optimum
>>
>> see https://github.com/lme4/lme4/issues/120
>> for more detailed code from Rune Christensen that implements
>> a series of convergence checks
>>
>>
>>
>>>
>>> Tobias
>>>
>>> -- 
>>> Tobias Heed, PhD
>>> Biological Psychology and Neuropsychology  |  University of Hamburg
>>> Von-Melle-Park 11, Room 206  |  D-20146 Hamburg, Germany
>>> Phone: (49) 40 - 42838 5831  |  Fax: (49) 40 - 42838 6591
>>> tobias.heed at uni-hamburg.de  |  Website  |  Google Scholar  |  ResearcherID
>>>
>>> On 17.09.2013, at 21:27, Ben Bolker <bbolker at gmail.com> wrote:
>>>
>>>> Tobias Heed <tobias.heed at ...> writes:
>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> I am trying to understand the different options for fitting with
>>>>> glmer. I have been unable to find an overview over which options
>>>>> are appropriate in which cases. If there is a document out there
>>>>> that explains these things, I'd be thankful for a link.
>>>>
>>>> No (want to write one?)
>>>>
>>>>>
>>>>> My specific questions are:
>>>>
>>>>> 1, what is the difference in using maxIter in the function call
>>>>> vs. using maxfun in glmerControl()? Which one is better or more
>>>>> important to change when a model doesn't converge (i.e., what
>>>>> kind of iteration do they stand for)? Maxiter seems not to be
>>>>> documented in the help of lme4 1.1.0, does this mean it should
>>>>> not be used anymore?
>>>>
>>>> maxIter is old/obsolete.  maxfun controls the maximum number of
>>>> function evaluations in the BOBYQA/Nelder-Mead phase (i.e.,
>>>> optimization over the 'theta' parameter vector, the Cholesky
>>>> factors of the random-effects variance-covariance matrices)
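For example, maxfun can be raised through glmerControl()'s optCtrl list
(a sketch using the built-in cbpp data; substitute your own model):

```r
## Sketch: raising the optimizer's function-evaluation limit (maxfun)
## via glmerControl(). cbpp is a built-in lme4 example dataset.
library(lme4)
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial,
             control = glmerControl(optCtrl = list(maxfun = 1e5)))
```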
>>>>
>>>>> 2, I have a model that does not converge with Nelder-Mead, but
>>>>> does converge with bobyqa -- from googling around, it seems that
>>>>> some people like one or the other better, but are there specific
>>>>> things I should look out for when using the one or the other? Or,
>>>>> are there specific cases in which using one or the other would be
>>>>> more recommendable?
>>>>
>>>> We don't know enough about this (yet) to make strong
>>>> recommendations.
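A pragmatic check in the meantime (a sketch, not a firm recommendation)
is to fit the same model with both optimizers and compare the
log-likelihoods; a large discrepancy suggests at least one fit is suspect:

```r
## Sketch: compare Nelder-Mead and bobyqa fits of the same model
## (built-in cbpp data as a stand-in for your own).
library(lme4)
form <- cbind(incidence, size - incidence) ~ period + (1 | herd)
g_nm <- glmer(form, data = cbpp, family = binomial,
              control = glmerControl(optimizer = "Nelder_Mead"))
g_bq <- glmer(form, data = cbpp, family = binomial,
              control = glmerControl(optimizer = "bobyqa"))
## Log-likelihoods should agree closely if both converged properly
c(logLik(g_nm), logLik(g_bq))
```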
>>>>
>>>>> 3, what kind of result or warning message would indicate that I
>>>>> should use the restart_edge option?
>>>>
>>>> If you get parameters on the boundary (i.e. 0 variances, +/-1
>>>> correlations) it may be worth trying.  However, I'm not sure it's
>>>> actually implemented for glmer!
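One quick way to spot a boundary fit (a sketch): the theta elements whose
lower bound is zero correspond to standard-deviation terms, so
(near-)zeros there indicate a variance estimated at the boundary:

```r
## Sketch: check whether any covariance parameter sits on its boundary.
## (Built-in sleepstudy example; substitute your own fitted model.)
library(lme4)
fit <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
theta <- getME(fit, "theta")
lower <- getME(fit, "lower")
## TRUE would suggest a singular (boundary) fit
any(theta[lower == 0] < 1e-6)
```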
>>>>
>>>>> 4, I got this warning: 2: In commonArgs(par, fn, control,
>>>>> environment()) : maxfun < 10 * length(par)^2 is not recommended.
>>>>> par appears to be the vector with parameters passed to the
>>>>> optimizer. Is it necessary (or just "better", but not imperative)
>>>>> to set maxfun to the value indicated in this equation, or higher?
>>>>> Why is a higher value for maxfun not used automatically when
>>>>> appropriate - does it have any negative consequences? Can I read
>>>>> out par easily somewhere?
>>>>
>>>> I believe this is coming from BOBYQA, but I'm not sure.
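You can reconstruct the comparison yourself, though (a sketch; for a
glmer fit with nAGQ > 0 the final-stage parameter vector is, as far as I
know, the covariance parameters theta plus the fixed-effect coefficients
beta):

```r
## Sketch: length of the optimizer's parameter vector and the
## 10 * length(par)^2 threshold mentioned in the warning.
library(lme4)
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)
par_len <- length(c(getME(gm1, "theta"), getME(gm1, "beta")))
10 * par_len^2  ## maxfun should be at least this large
```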
>>>>
>>>>
>>>>> 5, when a model converges only after tinkering with any of the
>>>>> options (e.g., optimizer, maxfun, restart_edge) or maxiter, does
>>>>> this say anything about the quality or reliability of the fit?
>>>>
>>>> I would certainly be more careful to assess convergence in these
>>>> cases.  Do the answers look sensible?  (We hope to add some more
>>>> functionality for checking convergence ...)
>>>>
>>>>> 6, when reporting a GLMM, should these kinds of options be
>>>>> reported? It doesn't seem that people do, but it would seem
>>>>> appropriate when they are necessary to achieve convergence etc.,
>>>>> wouldn't it?
>>>>
>>>> Absolutely.  You should always report *everything* necessary for
>>>> someone to reproduce your results (in an appendix or online
>>>> supplement, if necessary).
>>>>
>>>> cheers Ben Bolker
>>>>
>>>> _______________________________________________
>>>> R-sig-mixed-models at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>
>



More information about the R-sig-mixed-models mailing list