[R-sig-ME] More naive questions: Speed comparisons? what is a "stack imbalance" in lmer? does lmer center variables?

Wed Sep 23 18:34:52 CEST 2009

I forgot to cc: the list on this reply.

---------- Forwarded message ----------
From: Douglas Bates <bates at stat.wisc.edu>
Date: Wed, Sep 23, 2009 at 11:29 AM
Subject: Re: [R-sig-ME] More naive questions: Speed comparisons? what
is a "stack imbalance" in lmer? does lmer center variables?
To: Paul Johnson <pauljohn32 at gmail.com>

On Wed, Sep 23, 2009 at 1:36 AM, Paul Johnson <pauljohn32 at gmail.com> wrote:
> Sent this to r-sig-debian by mistake the first time.  Depressing.

> 1.  One general question for general discussion:

> Is HLM6 faster than lmer? If so, why?

> I'm always advocating R to students, but some faculty members are
> skeptical.  A colleague compared the commercial HLM6 software to lmer.
>  HLM6 seems to fit the model in 1 second, but lmer takes 60 seconds.

> If you have HLM6 (I don't), can you tell me if you see similar differences?

Honestly, Paul, I don't see a point in discussing vague generalities
and rumors.  If your colleague can make the data (or even simuiated
data with a similar structure) available and describe the model being
fit then we can see why lmer appears to be so much slower on it.  But
do bear in mind that the folks marketing HLM6 have the ability -
indeed they have the right - to run lmer on their test cases and to
examine every single line of code in lmer so they can determine
exactly what it is doing.  They have the right to do that because
everyone has the right to do that.  If we want to find out how HLM6
performs on a test case we are reduced to your approach of asking if
someone has a copy of the software (on one of or perhaps the only
operating system under which it runs) and all that can be done is to
quote the results.  The internal details of HLM6 are proprietary.

I can tell you my goals in developing the lme4 package are:
 1. Provide the ability to fit general forms of mixed-effects models.
 2. To the best of my ability, provide reliable estimates.
 3. Allow for models to be fit to very large data sets.
 4. Take advantage of the R environment to allow for straightforward
model specification and for very general data manipulation and
graphical capabilities.
 5. Subject to objectives 1-4 provide the fastest software that I can

> My first thought was that LM6 uses PQL by default, and it would be
> faster.  However, in the output, HLM6 says:

> Method of estimation: restricted maximum likelihood

> But that doesn't tell me what quadrature approach they use, does it?

To me this is very confusing.  In my world REML is only meaningful for
linear mixed models.  (I know, at one time Mary Lindstrom and I wrote
about REML estimates for nonlinear mixed models but, as I have said
before, I consider that a youthful indiscretion.)  For linear mixed
models there is no need to use PQL or to discuss quadrature because
the evaluation of the profiled deviance or the profiled REML criterion
is a direct calculation.

Again, without a more complete description of the data and the model
to be fit it is impossible to discuss these issues meaningfully.

> Another explanation for the difference in time might be the way HLM6
> saves the results of some matrix calculations and re-uses them behind
> the scenes.  If every call to lmer is re-calculating some big matrix
> results, I suppose that could explain it.

But you don't need to speculate about what lmer does.  It is Open
Source so you can check for yourself.

However, this does bring up another point which is the need to compare
apples with apples when you are benchmarking software.  If the data
import and model specification stages in HLM6 create the necessary
matrix structures for performing the iterative fit then does the time
to fit the model consist solely of the optimization and summary
stages?  Using an expression like

system.time(fm1 <- lmer(...))

is assessing the time to take the original data, which could be in a
very general form, create all those internal structures and perform
the optimization.

You should bear in mind that a lot of that construction of the model
structures is written in R exactly so that it is capable of fitting
very general model specifications.  The code in HLM6 is, I imagine,
compiled code, which is possible because it targets a very specific
task, and compiled code is always going to be much faster than
interpreted code.

> There are comparisons from 2006 here

> http://www.cmm.bristol.ac.uk/learning-training/multilevel-m-software/tables.shtml

> that indicate that lme was much slower than HLM, but that doesn't help
> me understand *why* there is a difference.

I believe that those examples are available for testing with lmer in
the mlmRev package for R.  However, the results can't be compared
meaningfully to the tabled results because today's hardware is so
different from what was used for those comparisons.

Besides, none of those examples is really challenging.  I was just
doing some timing tests on a model fit to 1.7 million observations
with 750,000 random effects associated with four different, non-nested
grouping factors and 40 fixed-effects parameters.  To me that is a
meaningful test because it took from 1 to 2 hours to fit on a fast
server computer.  Unfortunately, I can't make that data available
because of confidentiality restrictions but, even if I could, I don't
think the model could be fit in HLM6 or MLWin or SAS PROC MIXED or
Stata because there are 750,000 random effects in a non-nested
configuration.

I can make a similar but smaller test case available using the star
(Student-Teacher Achievement Ratio) data from the mlmRev package.  One
model could be

> system.time(fm1 <- lmer(math ~ gr + cltype + sx + eth + (1|id) + (1|tch) + (1|sch), star))
  user  system elapsed
 9.885   0.044  10.508
> print(fm1, corr = FALSE)
Linear mixed model fit by REML
Formula: math ~ gr + cltype + sx + eth + (1 | id) + (1 | tch) + (1 | sch)
  Data: star
   AIC    BIC  logLik deviance REMLdev
 239236 239365 -119602   239245  239204
Random effects:
 Groups   Name        Variance Std.Dev.
 id       (Intercept) 1001.60  31.648
 tch      (Intercept)  295.03  17.176
 sch      (Intercept)  104.78  10.236
 Residual              397.32  19.933
Number of obs: 24578, groups: id, 10732; tch, 1374; sch, 80

Fixed effects:
           Estimate Std. Error t value
(Intercept) 560.8868     1.5745   356.2
gr.L         95.6126     1.0080    94.9
gr.Q         -4.6783     0.9857    -4.7
gr.C         -3.2320     0.9729    -3.3
cltypereg    -7.7943     1.3322    -5.9
cltypereg+A  -7.0055     1.3265    -5.3
sxF           2.8766     0.6836     4.2
ethB        -21.9197     1.2219   -17.9
ethA          3.1940     6.7037     0.5
ethH          4.5411     9.6653     0.5
ethI        -28.1618    13.5359    -2.1
ethO          3.4356     8.3133     0.4

Unfortunately, because that model is based on only 25000 observations
and 11000 random effects it is difficult to get a meaningful timing.
Timings under 10 seconds, like this, are subject to too much
variability.

Nevertheless you can take that data and try to fit that model in any
of the other software systems.  Note that student (the "id" factor) is
not nested within teacher ("tch") or school, as will almost always
happen when fitting longitudinal data so the random effects are
partially crossed.  In other words, they are not nested so the
structure is not hierarchical or multi-level in the sense of HLM or
MLWin.  I would invite those who have access to such software to fit
the model using lme4 and to fit an equivalent model in other systems
and tell us how they compare.  I can't do that because I choose not to
use proprietary software.

> 2. What does "stack imbalance in .Call" mean in lmer?

It is a sign of a bug in the C code underlying the lme4 package.  It
should be reported as a bug if it occurred in the currently released
version of lme4.

> Here's why I ask.  Searching for comparisons of lmer and HLM,  I went
> to CRAN &  I checked this document:
>
> http://cran.r-project.org/web/packages/mlmRev/vignettes/MlmSoftRev.pdf
>
> I *think* these things are automatically generated.  The version
> that's up there at this moment  (mlmRev edition 0.99875-1)  has pages
> full of the error message:
>
> stack imbalance in .Call,
>
> Were those always there?  I don't think so.   What do they mean?

No they shouldn't be there.  I'll update the mlmRev package.

> 3. In the HLM6 output, there is a message at the end of the variable list:
>
> '%' - This level-1 predictor has been centered around its grand mean.
> '$' - This level-2 predictor has been centered around its grand mean.
>
> What effect does that have on the estimates?  I believe it should have
> no effect on the fixed effect slope estimates, but it seems to me the
> estimates of the variances of random parameters would be
> changed.  In order to make the estimates from lmer as directly
> comparable as possible, should I manually center all of the variables
> before fitting the model?   I'm a little stumped on how to center a
> multi-category factor before feeding it to lmer.  Know what I mean?