[R-sig-ME] lme4/glmer convergence warnings

Ben Bolker bbolker at gmail.com
Thu Apr 10 22:32:38 CEST 2014


On 14-04-10 04:28 PM, W Robert Long wrote:
> Hi Ben
> 
> Could you clarify what you mean by
> 
>>    params <- getME(fitted_model,"theta")
>>    ## or unlist(getME(fitted_model,c("theta","fixef"))) for GLMMs
>>    grad(dd,params)
> 
> This is a logistic model, so I assume I need to use unlist; however, it
> seems that "fixef" is not one of the available components, because:
> Error in match.arg(name) :
>   'arg' should be one of “X”, “Z”, “Zt”, “Ztlist”, “y”, “mu”, “u”, “b”,
> “Gp”, “Tp”, “L”, “Lambda”, “Lambdat”, “Lind”, “A”, “RX”, “RZX”, “sigma”,
> “flist”, “beta”, “theta”, “ST”, “REML”, “is_REML”, “n_rtrms”, “n_rfacs”,
> “cnms”, “devcomp”, “offset”, “lower”
> 
> Also, grad() does not seem to be an available function.
> 
> Please advise
> 
> Thanks
> Rob

  Oops, I replied in too much of a hurry.

 * We just added 'fixef' as a synonym for 'beta' in getME, but that
change is only on GitHub so far.  unlist(getME(fitted_model,c("theta","beta")))
should work ...
 * grad() is in the numDeriv package -- install/load it first, or use
numDeriv::grad() if it's installed and you don't feel like loading it.
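
  Putting the pieces together for a GLMM, the comparison I have in mind
would be something like this (untested sketch; 'fitted_model' stands for
your fitted glmer model):

   library(numDeriv)  # install.packages("numDeriv") if needed
   dd <- update(fitted_model, devFunOnly = TRUE)
   ## for a GLMM the deviance function takes c(theta, beta)
   params <- unlist(getME(fitted_model, c("theta", "beta")))
   ## numDeriv's Richardson-extrapolated gradient at the fitted parameters
   grad(dd, params)
   ## lme4's internal finite-difference estimate, for comparison
   fitted_model@optinfo$derivs$gradient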

  cheers
    Ben Bolker

> On 10/04/2014 20:08, Ben Bolker wrote:
>> On 14-04-10 04:38 AM, W Robert Long wrote:
>>> Hi Ben
>>>
>>> For my model, I get
>>>
>>>>    max(abs(relgrad))
>>> [1] 1.081706
>>>
>>> Does this help ?
>>>
>>> Thanks
>>> Rob
>>>
>>
>>    Thanks.
>>     Unfortunately it means there's still something we don't understand.
>> To sum up:
>>
>>    * we calculate the finite-difference gradients and Hessians with
>> respect to the model parameters at the estimated MLE (or restricted MLE).
>>
>>    - these might be poor estimates; we are using our own hand-rolled,
>> naive central finite difference code rather than the better tested code
>> in the numDeriv package, which uses the more expensive but more accurate
>> Richardson method.  To see if this is the problem, compare
>>
>>    fitted_model@optinfo$derivs$gradient
>>
>> to
>>
>>    dd <- update(fitted_model,devFunOnly=TRUE)
>>    params <- getME(fitted_model,"theta")
>>    ## or unlist(getME(fitted_model,c("theta","fixef"))) for GLMMs
>>    grad(dd,params)
>>
>> and see if they differ significantly.  I mostly have *not* seen a
>> discrepancy in these, although Rune Haubo has reported some
>> discrepancies (though I think those were in the Hessian rather than the
>> gradient ...?).
>>
>>    It might make sense to compare the Hessian in @optinfo with
>> hessian(dd,params) from numDeriv as well.
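>>
>> Something along these lines (untested; requires numDeriv):
>>
>>    library(numDeriv)
>>    dd <- update(fitted_model, devFunOnly = TRUE)
>>    params <- unlist(getME(fitted_model, c("theta", "beta")))
>>    ## Richardson-method Hessian vs. lme4's internal finite-difference one
>>    range(hessian(dd, params) - fitted_model@optinfo$derivs$Hessian)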
>>
>>   * we then check those gradients, or the relative gradients from
>>
>>    with(fitted_model@optinfo$derivs,solve(Hessian,gradient))
>>
>> against some tolerance.  It makes sense that large gradients at the
>> estimated parameters (where they should be zero) are a problem, but it
>> remains unclear to me exactly on what scale they should be small, or
>> how we should make this comparison ...
>>
>>    It might make sense to see if John Nash's optfntools package (on
>> r-forge) has any useful tools or ideas, although as I recall (vaguely)
>> those implementations of (e.g.) the KKT criteria
>> <http://en.wikipedia.org/wiki/Karush%E2%80%93Kuhn%E2%80%93Tucker_conditions>
>> were pretty sensitive ...
>>
>>   * as a crude test of whether the actual value you got is a false
>> convergence (I know in this case you don't think it is, but including
>> this for completeness) the things I know to do are (1) try re-fitting
>> with different optimizers, either starting from the putative best fit or
>> from scratch (see
>> <https://github.com/lme4/lme4/blob/master/misc/issues/allFit.R>); (2)
>> try re-starting the _same_ optimizer from the putative best fit; (3)
>> explore the likelihood surface (e.g. with profiling or bbmle::slice2D).
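>>
>> A minimal version of (1) and (2) might look like this (untested;
>> optimizer names and start-value format as in ?glmer/?glmerControl):
>>
>>    ## (1) re-fit from scratch with a different optimizer
>>    fit2 <- update(fitted_model,
>>                   control = glmerControl(optimizer = "bobyqa"))
>>    ## (2) restart the _same_ optimizer from the putative best fit
>>    fit3 <- update(fitted_model,
>>                   start = list(theta = getME(fitted_model, "theta"),
>>                                fixef = fixef(fitted_model)))
>>    ## log-likelihoods should agree closely if the fit has converged
>>    c(logLik(fitted_model), logLik(fit2), logLik(fit3))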
>>
>>    Ben Bolker
>>
>>> On 10/04/2014 03:33, Ben Bolker wrote:
>>>> Ben Bolker <bbolker at ...> writes:
>>>>
>>>>>
>>>>> On 14-04-06 04:31 AM, Tibor Kiss wrote:
>>>>>> Hi,
>>>>>>
>>>>>> being somewhat nonplussed by similar messages, I also applied
>>>>>> Ben's recent suggestion to one of my models to get:
>>>>>>
>>>>>>        Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
>>>>>> 1.343e-05 3.530e-05 5.756e-05 7.631e-05 9.841e-05 1.932e-04
>>>>>>
>>>>>> So following up on Rob's message: What does it mean?
>>>>>>
>>>>>> With kind regards
>>>>>>
>>>>>> Tibor
>>>>>>
>>>>>
>>>>>     It means that on the scale of the _standard deviations_ of the
>>>>> parameters, the estimated gradients at the MLE (or restricted MLE) are
>>>>> not large.  (Roughly speaking, near the optimum the inverse Hessian is
>>>>> proportional to the variance-covariance matrix of the parameters, so
>>>>> solve(Hessian, gradient) expresses the gradient in units comparable to
>>>>> the parameter uncertainties.)  I was surprised in Rob's case that these
>>>>> scaled gradients were not all that small: much smaller than without the
>>>>> scaling, but not small enough to make me think I really understand
>>>>> what's going on.
>>>>>
>>>>>     To recapitulate: the appearance of all of these new messages in
>>>>> the latest version of lme4 is **not** due to a degradation or change
>>>>> in the optimization or fitting procedure -- it's due to a new set of
>>>>> convergence tests that we implemented, which we think are giving a
>>>>> lot of false positives.  You can easily shut the tests off yourself,
>>>>> or raise the tolerance for the warnings (see ?lmerControl and
>>>>> ?glmerControl; a sketch follows below).  As developers, we're a bit
>>>>> stuck now because we don't want to turn the warnings _off_ until we
>>>>> better understand the circumstances that are triggering them, and
>>>>> that takes more time and effort than we've been able to muster so
>>>>> far.
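>>>>>
>>>>> For example, something like this (untested; see ?glmerControl for the
>>>>> full set of check.conv.* options, or lmerControl for linear models):
>>>>>
>>>>>    ## raise the tolerance on the max|grad| convergence warning
>>>>>    m1 <- update(fitted_model, control =
>>>>>        glmerControl(check.conv.grad = .makeCC("warning", tol = 0.01)))
>>>>>    ## or turn the gradient check off entirely
>>>>>    m2 <- update(fitted_model, control =
>>>>>        glmerControl(check.conv.grad = "ignore"))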
>>>>
>>>>     [much context snipped]
>>>>
>>>>     Just to follow up on this: more technical discussion is going on at
>>>> https://github.com/lme4/lme4/issues/120 ... at present, it is looking
>>>> like scaling the gradient by the Hessian is going to solve a lot of
>>>> problems.  If you are experiencing convergence warnings about
>>>> max|grad| that you suspect are false positives, it would be a great
>>>> help if you could try
>>>>
>>>>     relgrad <- with(fitted_model@optinfo$derivs, solve(Hessian, gradient))
>>>>     max(abs(relgrad))
>>>>
>>>> check whether the result is a small number (e.g. <0.001) and report
>>>> **one way or the other** on this list, at the GitHub URL above, or
>>>> (least preferred) by e-mailing
>>>> lme4-authors at lists.r-forge.r-project.org.
>>>> We also hope that this test *will* pick up the cases where people have
>>>> reported problems with Nelder-Mead not working properly ...
>>>>
>>>>     Ben Bolker
>>>>