[R-sig-ME] glmer takes long time even after restricting iterations

Fri Sep 5 23:05:02 CEST 2014

I will take it as a compliment that you have sufficient confidence in our
software to try to fit such a model.  :-)

Sadly, even with 400,000 observations it is highly unlikely you would be
able to converge to parameter estimates for these models and even more
unlikely that the estimates would be meaningful.

The optimization in glmer is different from the optimization in lmer.  For
a linear mixed model the optimization is over the parameters in the
relative covariance matrix only.  In this case it looks like there would be
15 such parameters.  The optimization problem involving even these
parameters would be difficult, as it is likely that the solution will be on
the boundary of the feasible region, representing a singular covariance
matrix.  For glmer the optimization is much more difficult because it is
over the concatenation of the fixed-effects parameters and the covariance
parameters.  I lost track of what the number of fixed-effects parameters is
but that number is large.  As you have seen, the first model failed to
converge in 10,000 function evaluations.  That is not encouraging.
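
To make the parameter count concrete: a random-effects term with q columns
contributes q(q+1)/2 covariance parameters.  A rough sketch in base R, using
made-up toy data (only the column names come from the post; n_cov is a
helper invented for this illustration), shows how the count depends on
whether the indicators are stored as numeric 0/1 or as factors.  I am
assuming nothing about how they are actually stored in IARS.

```r
# Toy data: column names mirror the post, the values are invented.
d <- data.frame(injury = rep(0:1, 10),
                AMI    = rep(c(0, 0, 1, 1), 5),
                stroke = rep(c(0, 1, 1, 0), 5),
                resp   = rep(c(1, 0, 0, 0, 1), 4))

# Covariance parameters for a random-effects term with q columns
# (hypothetical helper, not part of lme4).
n_cov <- function(q) q * (q + 1) / 2

# Indicators stored as numeric 0/1: one column per variable.
q_num <- ncol(model.matrix(~ -1 + injury + AMI + stroke + resp, d))

# Indicators stored as factors: without an intercept the first factor
# contributes a column for each of its two levels.
d_fac <- as.data.frame(lapply(d, factor))
q_fac <- ncol(model.matrix(~ -1 + injury + AMI + stroke + resp, d_fac))

c(numeric = n_cov(q_num), factor = n_cov(q_fac))
```

Either way the fixed-effects coefficients are added on top of these when
glmer optimizes over both sets of parameters.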

Regarding the warning messages I will let Ben or Steve respond as they know
more about the convergence checks than I do.  I believe those diagnostics
involve creating a finite-difference approximation to the gradient vector
and the Hessian matrix.  The approximation of the Hessian matrix at the
optimum is probably where the time is being spent.
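
If the derivative calculation is indeed where the time goes, one thing to
try is switching it off.  The sketch below assumes an lme4 version whose
glmerControl() accepts calc.derivs, and it uses the small cbpp data set
shipped with lme4 as a stand-in for the data in the post.  I have not timed
this on a problem of the size described.

```r
library(lme4)

# cbpp ships with lme4; it only stands in for the much larger data set.
# calc.derivs = FALSE asks glmer not to compute the finite-difference
# gradient and Hessian after the optimizer has stopped.
fit <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             family = binomial, data = cbpp,
             control = glmerControl(calc.derivs = FALSE))

fixef(fit)
```

Note that skipping the derivatives also means the gradient-based
convergence diagnostics have nothing to work with, so this is a way to get
the fit back sooner, not a way to make the warnings go away.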

The best advice is to simplify the model.  You say that ALS is a binary
variable, which means that even with 400,000 observations you have only
400,000 bits of information to which to fit the model.  That's not a lot.
A continuous response provides much more information per observation than
a binary response.

Try fitting the fixed effects only, using glm.  I'm confident that most of
the coefficients will not be significant.
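
A minimal sketch of that fixed-effects-only fit, on simulated data (the
variable names are borrowed from the post, but the values, the sample size
and the reduced formula are all made up for illustration):

```r
set.seed(42)
n <- 2000

# Simulated stand-in for the real data; only injury affects ALS here.
d <- data.frame(FEMALE = rbinom(n, 1, 0.5),
                AGE    = rnorm(n, 60, 15),
                injury = rbinom(n, 1, 0.3))
d$ALS <- rbinom(n, 1, plogis(-0.5 + 0.8 * d$injury))

# Fixed effects only: no county-level random effects.
fit_glm <- glm(ALS ~ injury * (FEMALE + AGE), family = binomial, data = d)

# Coefficient table with standard errors, z values and p-values.
printCoefmat(coef(summary(fit_glm)))
```

Terms that contribute nothing in this fit are natural candidates to drop
before going back to glmer.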

On Fri, Sep 5, 2014 at 1:19 PM, Prachi Sanghavi <prachi.sanghavi at gmail.com>
wrote:

> Hello!
>
> I have a fairly complex multilevel, multivariate logistic model that I am
> trying to fit.  In both models below, the variables injury, AMI, stroke,
> and resp are binary, as well as ALS and most other variables.  There are a
> total of about 400,000 observations.  When I try to fit the model (Original
> Model), I get several warnings, and I have pasted these below.  I am
> largely concerned about number 4.  I think this problem is due to having
> too many parameters in the model, and so I removed several interactions
> that were unnecessary anyway (Modified Model).  I ran the Modified Model
> with a fixed number of iterations, and it finished these quickly enough
> (maybe 20 minutes?).  But then it took another 19 hours to actually stop
> running, during which time I suspect R was doing various checks that led to
> the warnings.  I'm not sure.  When the Modified Model finished, it produced
> the warnings below.
>
> My biggest problem right now is the amount of time it takes for R to stop
> running, even after restricting the number of iterations to 100.  Because
> of this problem, it is impractical to try to figure out how to address the
> warnings.
>
> Can somebody please help me figure out why R is taking so long, even after
> it has finished the 100 iterations?  And what can I do about it?
>
> Thank you!!
>
> Prachi Sanghavi
> Harvard University
>
>
> Original Model and Warnings:
>
> AMI_county_final_2 <- glmer(ALS ~ -1 + AMI + (injury + stroke +
> resp)*(FEMALE + AGE + MTUS_CNT + Asian + Black + Hispanic + Other +
> Custodial + Nursing + Scene + WhiteHigh + BlackHigh + BlackLow +
> IntegratedHigh + IntegratedLow + combinedscore + Year06 + Year07 + Year08 +
> Year09 + Year10 + Metro + Per_College_Plus + Per_Gen_Prac + Any_MedSchlAff
> + Any_Trauma) + (-1 + injury + AMI + stroke + resp | fullcounty),
> family=binomial, data=rbind(IARS,IARS2), verbose=2,
> control=glmerControl(optCtrl=list(maxfun=100)))
>
> Warning messages:
> 1: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf, :
>   failure to converge in 10000 evaluations
> 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 480.605 (tol = 0.001)
> 3: In if (resHess$code != 0) { :
>   the condition has length > 1 and only the first element will be used
> 4: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model is nearly unidentifiable: very large eigenvalue
>  - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
>  - Rescale variables?
>
> Modified Model and Warnings:
>
> AMI_county_final_2 <- glmer(ALS ~ -1 + Year06 + Year07 + Year08 + Year09 +
> Year10 + Metro + AMI + (injury + stroke + resp)*(FEMALE + AGE + MTUS_CNT +
> Asian + Black + Hispanic + Other + Custodial + Nursing + Scene + WhiteHigh
> + BlackHigh + BlackLow + IntegratedHigh + IntegratedLow + combinedscore) +
> (-1 + injury + AMI + stroke + resp | fullcounty), family=binomial,
> data=rbind(IARS,IARS2), verbose=2,
> control=glmerControl(optCtrl=list(maxfun=100)))
>
> Warning messages:
> 1: In commonArgs(par, fn, control, environment()) :
>   maxfun < 10 * length(par)^2 is not recommended.
> 2: In optwrap(optimizer, devfun, start, rho$lower, control = control,  :
>   convergence code 1 from bobyqa: bobyqa -- maximum number of function evaluations exceeded
> 3: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf, :
>   failure to converge in 100 evaluations
> 4: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 15923.5 (tol = 0.001)
> 5: In if (resHess$code != 0) { :
>   the condition has length > 1 and only the first element will be used
> 6: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model is nearly unidentifiable: very large eigenvalue
>  - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
>  - Rescale variables?
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
