[R-sig-ME] Improving computation time for a binary outcome in lme4

Tue May 25 03:09:02 CEST 2010

On Mon, May 24, 2010 at 7:46 PM, Robin Jeffries <rjeffries at ucla.edu> wrote:
> I am running a mixed effects model with two random effects that have ~500
> and ~1400 factor levels respectively.

> For a continuous outcome, the computation time using lme4 is workable.
> However for a binary outcome the computation time increases 4-80 fold
> compared to a similar model for a continuous outcome. (I tend to stop
> computations if they've been running more than 8 hours, so I don't have a
> max time estimate.)

There are at least two characteristics of the generalized linear mixed
model that are causing the increase in computational time.  The first
is the fact that the algorithm is based on iteratively reweighted
least squares (IRLS) and not ordinary least squares (OLS).  It is
inevitable that an iterative algorithm is slower than a direct
calculation.
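
To make the contrast concrete, here is a minimal sketch in base R for
ordinary (not mixed) models.  It is an illustration only, not the code
lme4 uses: the simulated data, the variable names and the convergence
tolerance are all assumptions made for the example.  The linear fit is
one least-squares solve, whereas the logistic fit repeats a weighted
least-squares solve until the coefficients stop changing.

```r
# Sketch only: plain OLS versus IRLS for a logistic regression.
# Simulated data and tolerance are assumptions for illustration.
set.seed(1)
n <- 500
X <- cbind(1, rnorm(n))                       # intercept plus one covariate
y_cont <- drop(X %*% c(0.5, 1)) + rnorm(n)    # continuous outcome
y_bin  <- rbinom(n, 1, plogis(drop(X %*% c(0.5, 1))))  # binary outcome

# Linear model: a single direct least-squares solve
beta_ols <- qr.solve(X, y_cont)

# Logistic model: iteratively reweighted least squares
beta <- c(0, 0)
iterations <- 0
repeat {
  eta <- drop(X %*% beta)                     # linear predictor
  mu  <- plogis(eta)                          # fitted probabilities
  w   <- mu * (1 - mu)                        # working weights
  z   <- eta + (y_bin - mu) / w               # working response
  beta_new <- qr.solve(sqrt(w) * X, sqrt(w) * z)  # weighted LS solve
  iterations <- iterations + 1
  converged <- max(abs(beta_new - beta)) < 1e-8
  beta <- beta_new
  if (converged || iterations >= 50) break
}
iterations   # number of weighted solves needed, versus one for OLS
```

In a mixed model each of those solves also involves the random
effects, and the whole iteration is repeated at every evaluation of
the objective, so the extra cost compounds.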

The second cause is the fact that one can "profile out" the
fixed-effects parameters in a linear mixed-effects model but not in a
generalized linear mixed-effects model.  You can fake it to some
extent, but the currently released version of the lme4 package does
not do so.  Thus, the greater the number of fixed-effects parameters,
the greater the complexity of the optimization problem.

If you use the verbose option to lmer and to glmer on similar problems
you will see that lmer is optimizing over fewer parameters than glmer
is.
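
A rough way to try this, assuming the lme4 package is installed, is
sketched below.  The sleepstudy and cbpp data sets shipped with lme4
are used only as stand-ins for your data, and the exact form of the
trace output depends on the lme4 version.  The argument is written as
verbose = 1L; some versions document it as a logical instead.

```r
# Sketch only: compare the optimizer traces printed by lmer and glmer.
# sleepstudy and cbpp are example data sets from lme4, used as stand-ins.
library(lme4)

# Linear mixed model: the trace shows the optimizer working on the
# variance-component parameter(s) only.
fm <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy,
           verbose = 1L)

# Binomial GLMM: the trace also carries the fixed-effects coefficients,
# so each line lists more parameter values.
gm <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
            data = cbpp, family = binomial, verbose = 1L)
```

Counting the parameter values printed on each trace line for the two
fits shows the difference in the size of the optimization problem.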

> At least one of the fixed effects is also a 6-level factor. I attempted to
> treat this as a sparse matrix, but lmer() doesn't seem to allow for this
> type of matrix in the model.

As I mentioned in my reply on R-help, the development version of the
lme4 package does have a sparseX option.  For a factor with 6 levels
it is unlikely that it will help.  The sparsity index of the X matrix
will be greater than 1/6 and that is close to the breakpoint where
dense methods, which do more numerical computation but less structural
analysis, are actually faster than sparse methods.
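
The 1/6 figure can be checked with a small base R sketch.  The factor
here is simulated and the helper name frac_nonzero is made up for the
example; the point is only to show what fraction of the entries of X
are nonzero for a 6-level factor.

```r
# Sketch only: fraction of nonzero entries in the model matrix for a
# 6-level factor.  Simulated factor; frac_nonzero is a hypothetical helper.
set.seed(1)
f <- factor(sample(letters[1:6], 1000, replace = TRUE))

X_ind <- model.matrix(~ 0 + f)   # pure indicator coding, one nonzero per row
X_trt <- model.matrix(~ f)       # default treatment contrasts plus intercept

frac_nonzero <- function(X) mean(X != 0)

frac_nonzero(X_ind)   # 1/6: each row has a single 1 among 6 columns
frac_nonzero(X_trt)   # larger than 1/6: the intercept column is all ones
```

Any additional dense columns, such as continuous covariates, push the
fraction higher still.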

> Are there any suggestions on what I can do (other than simplify the model)
> to improve the computation time for a binary outcome?

There are the usual suspects of getting access to a fast computer with
lots of memory and a 64-bit operating system.  You could see whether
an accelerated BLAS will help.  For example, Revolution R has the MKL
BLAS built-in.  Regrettably, that isn't always a speed boost.  We have
seen situations where multi-threaded BLAS actually slow down sparse
matrix operations because the communications overhead is greater than
the time savings of being able to perform more flops per second.

> Also, could people comment on the speed of MCMCglmm vs lme4? Perhaps I could
> go this route if it will prove to be substantially quicker for a binary
> outcome.

> Thank you to Douglas Bates for suggesting I post here. I think I'll be able
> to find more help using lme4 here than on the normal R-help.
>
> ~~~~~~~~~~~~~~~~~~~
> -Robin Jeffries
> Dr.P.H. Candidate in Biostatistics
> UCLA School of Public Health
> rjeffries at ucla.edu
> 530-624-0428