[R-sig-ME] False convergence

Ben Bolker bbolker at gmail.com
Sat Mar 2 22:34:59 CET 2013


Nick Isaac <njbisaac at ...> writes:

> Dear list,
> 
> I'm running a set of models in glmer(), some of which return the 'false
> convergence' error (cvg=8). I'm trying to understand why.
> 
> My models all have the same basic structure: glmer(P ~ Year + (1|Site),
> binomial), where P is a vector of 0s and 1s. Year is centered on zero,
> which I've found greatly reduces the incidence of false convergence in
> the past. There are ~100,000 observations and 9000 sites.
> 
> I've tried a couple of fixes, including the .Call("mer_optimize",...) hack
> as well as an observation-level random effect. The former has no impact on
> the parameter estimates, and the latter still returns the false-convergence
> warning.
> 
> I'm using the verbose=T argument. I've noticed previously that false
> convergence is characterised by just 1 or 2 iterations being completed
> and variances on the random effects that are either close to zero or
> astronomically huge. But my models run for dozens of iterations, with
> sensible-looking variances that change moderately among iterations but
> then stabilise during the last few iterations, and the parameter
> estimates look sensible.
> 
> In other words, the models do not show any evidence of failure, except the
> warning message. So which should I believe: the verbose trace or the
> warning message?
> 
> Perhaps someone can give me further insight into why glmer() thinks the
> model has not properly converged.

  In general, I would believe the verbose trace ...

  The stable version of lme4 uses the nlminb() optimizer internally,
which in turn is based on the PORT libraries.

The docs linked from ?nlminb:

http://netlib.bell-labs.com/cm/cs/cstr/153.pdf

The only useful material I could find in these docs was:

------------
p. 5: false convergence: the gradient ∇f(x) may be computed
incorrectly, the other stopping tolerances may be too tight, or either
f or ∇f may be discontinuous near the current iterate x.

p. 9: V(XFTOL) -- V(34) is the false-convergence tolerance. A return
with IV(1) = 8 occurs if a more favorable stopping test is not
satisfied and if a step of scaled length at most V(XFTOL) is tried but
not accepted. "Scaled length" is in the sense of (5.1). Such a
return generally means there is an error in computing ∇f(x), or
the favorable convergence tolerances (V(RFCTOL), V(XCTOL), and
perhaps V(AFCTOL)) are too tight for the accuracy to which f(x) is
computed (see §9), or ∇f (or f itself) is discontinuous near x. An
error in computing ∇f(x) usually leads to false convergence after
only a few iterations -- often in the first.  Default = 100*MACHEP.

p. 13: Sometimes evaluating f(x) involves an extensive computation,
such as performing a simulation or adaptive numerical quadrature or
integrating an ordinary or partial differential equation. In such
cases the value computed for f(x), say f̃(x), may involve
substantial error (in the eyes of the optimization algorithm). To
eliminate some "false convergence" messages and useless function
evaluations, it is necessary to increase the stopping tolerances and,
when finite-difference derivative approximations are used, to increase
the step-sizes used in estimating derivatives.
----------

"evaluating f(x) involves an extensive computation" is a reasonably
good description of what's going on inside lme4 (although I think the
internal computations are _slightly_ less involved/noisy than a
typical ODE solution or generic integration by quadrature).
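
For what it's worth, the false-convergence tolerance is exposed
directly by nlminb(); here is a minimal sketch on a toy quadratic
objective (not your model), using the documented xf.tol control
parameter, which corresponds to V(XFTOL) above:

f <- function(p) sum((p - 1)^2)  ## toy objective, minimum at (1, 1)
fit <- nlminb(start = c(0, 0), objective = f,
              control = list(xf.tol = 1e-10, ## false-convergence tolerance
                             trace = 1))     ## print one line per iteration
fit$convergence  ## 0 indicates successful convergence
fit$message      ## PORT return message, e.g. "false convergence (8)"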

Are all of your covariate combinations (Year, Site) unique, or can
you collapse the data to a binomial (successes out of trials)
response? That could help a lot with both speed and stability, and
should be functionally equivalent (see the sketch below) ...
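
A sketch of the collapsing step, assuming your data frame is called
dat, with columns P, Year, and Site (names guessed from your model
formula):

## collapse Bernoulli 0/1 rows sharing a (Year, Site) combination
## into binomial (successes, failures) counts
agg <- aggregate(P ~ Year + Site, data = dat,
                 FUN = function(x) c(succ = sum(x), fail = sum(1 - x)))
agg <- do.call(data.frame, agg)  ## flatten matrix column to P.succ/P.fail
## equivalent model with a two-column (successes, failures) response
fit <- glmer(cbind(P.succ, P.fail) ~ Year + (1 | Site),
             data = agg, family = binomial)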

Observation-level random effects should have essentially no effect
for a Bernoulli response: the observation-level variance is not
identifiable from binary data.
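
For reference, that term is typically constructed like this (a
sketch, with dat again assumed):

dat$obs <- factor(seq_len(nrow(dat)))  ## one factor level per row
fit_olre <- glmer(P ~ Year + (1 | Site) + (1 | obs),
                  data = dat, family = binomial)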

nlminb() is notoriously cryptic/sensitive: it might be worth
checking the development version of lme4, which uses a more
robust optimizer by default (although you can also use nlminb()
for backward comparison) and allows more control over, and
investigation of, the optimization.
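
A sketch, assuming the development version's control interface
(glmerControl() selects the optimizer in lme4 >= 1.0):

fit2 <- glmer(P ~ Year + (1 | Site), data = dat, family = binomial,
              control = glmerControl(optimizer = "bobyqa"))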

  Ben Bolker


