[R-sig-ME] More naive questions: Speed comparisons? what is a "stack imbalance" in lmer? does lmer center variables?

Adam D. I. Kramer adik at ilovebacon.org
Wed Sep 23 23:19:15 CEST 2009


Dear colleagues,

 	Questions of speed, especially comparative speed between R and some
proprietary program, come up with some frequency.  Perhaps we should add a
FAQ that covers this?  Here is my first attempt at such a contribution.

Section: R Speed. I have {heard,seen,shown} that R is {a bit,a lot,eons}
slower than [other program].  Is this true?

0) We assume that the "benchmark" data sets adequately represent the sorts
of tasks you would use the software for, and that the benchmark covers the
whole process (including the data import and transformation needed to
prepare the model to run).
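
For example, a fair comparison times the entire pipeline rather than the
final fit alone.  A minimal sketch (the file name, variables, and model
here are hypothetical):

   library(lme4)
   system.time({
       dat <- read.csv("grades.csv")                      # data import
       dat$score <- as.numeric(scale(dat$score))          # transformation
       fm <- lmer(score ~ week + (week | student), dat)   # model fit
   })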

1) Proprietary software produces unverifiable statistical results: You
cannot deduce or detect when, whether, or how your results are wrong, unless
you compare the results to a trustworthy algorithm.  It is always faster to
use one method (e.g., the trustworthy one) than two.

2) Paying skilled programmers to write, optimize, and compile statistical
algorithms optimizes for speed.  Letting skilled statisticians who know how
to code, and who are personally motivated to implement an algorithm,
optimizes for the correctness and robustness of a model and its results.

3) The extensible nature of R algorithms may require more processing time,
but that time is likely less than the time required to export, import,
reformat, and re-tidy a data set across several programs (HLM6, SPSS, SAS,
etc.), much as it is faster to visit one large department store to buy a
frying pan, perfume, and a suit than to drive across town three times to
separate specialty stores.

4) The speed of R varies with several factors, including the version of R
you use and, in some cases, the BLAS library against which it is linked.
If you believe R is much slower than it should be, make sure you have
upgraded to the latest version and are using the appropriate build for your
hardware and operating system (e.g., 64-bit R for Snow Leopard).
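
A quick way to confirm which version and build you are running (the
example output in the comment is only illustrative):

   R.version.string   # e.g. "R version 2.9.2 (2009-08-24)"
   sessionInfo()      # platform, architecture, and attached packages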

5) R's code is written to handle arbitrarily large data sets; many "very
fast" algorithms that other programs might use can make assumptions about
data size that break down for large data sets, leading to crashes, errors,
or results that seem correct but are not.

Further:

6) R is more than a one-trick pony, and the ability to do nearly everything
comes at the expense of micro-optimization.  Human triathletes likewise do
not and cannot bike, run, or swim as quickly as dedicated cyclists,
runners, or swimmers.

7) It is probably unreasonable even to expect open-source interpreted code
to be as fast as code written by people whose full-time job is to write and
optimize code.  The fact that R's speed is of the same order of magnitude
as that of "some other program" is itself remarkable.

...in sum, it is possible that a proprietary software system may produce a
result of the form you expect faster than R does, but it is unlikely to be
MUCH faster when you consider the entire data analysis process, and there
are several strong arguments for waiting the few extra seconds.

--Adam

On Wed, 23 Sep 2009, Douglas Bates wrote:

> I forgot to cc: the list on this reply.
>
> ---------- Forwarded message ----------
> From: Douglas Bates <bates at stat.wisc.edu>
> Date: Wed, Sep 23, 2009 at 11:29 AM
> Subject: Re: [R-sig-ME] More naive questions: Speed comparisons? what
> is a "stack imbalance" in lmer? does lmer center variables?
> To: Paul Johnson <pauljohn32 at gmail.com>
>
>
>
> On Wed, Sep 23, 2009 at 1:36 AM, Paul Johnson <pauljohn32 at gmail.com> wrote:
>> Sent this to r-sig-debian by mistake the first time.  Depressing.
>
>> 1.  One general question for general discussion:
>
>> Is HLM6 faster than lmer? If so, why?
>
>> I'm always advocating R to students, but some faculty members are
>> skeptical.  A colleague compared the commercial HLM6 software to lmer.
>>  HLM6 seems to fit the model in 1 second, but lmer takes 60 seconds.
>
>> If you have HLM6 (I don't), can you tell me if you see similar differences?
>
> Honestly, Paul, I don't see a point in discussing vague generalities
> and rumors.  If your colleague can make the data (or even simulated
> data with a similar structure) available and describe the model being
> fit then we can see why lmer appears to be so much slower on it.  But
> do bear in mind that the folks marketing HLM6 have the ability -
> indeed they have the right - to run lmer on their test cases and to
> examine every single line of code in lmer so they can determine
> exactly what it is doing.  They have the right to do that because
> everyone has the right to do that.  If we want to find out how HLM6
> performs on a test case we are reduced to your approach of asking if
> someone has a copy of the software (on one of the operating systems
> under which it runs, perhaps the only one) and all that can be done is to
> quote the results.  The internal details of HLM6 are proprietary.
>
> I can tell you my goals in developing the lme4 package are:
>  1. Provide the ability to fit general forms of mixed-effects models.
>  2. To the best of my ability, provide reliable estimates.
>  3. Allow for models to be fit to very large data sets.
>  4. Take advantage of the R environment to allow for straightforward
> model specification and for very general data manipulation and
> graphical capabilities.
>  5. Subject to objectives 1-4, provide the fastest software that I can.
>
>> My first thought was that HLM6 uses PQL by default, and it would be
>> faster.  However, in the output, HLM6 says:
>
>> Method of estimation: restricted maximum likelihood
>
>> But that doesn't tell me what quadrature approach they use, does it?
>
> To me this is very confusing.  In my world REML is only meaningful for
> linear mixed models.  (I know, at one time Mary Lindstrom and I wrote
> about REML estimates for nonlinear mixed models but, as I have said
> before, I consider that a youthful indiscretion.)  For linear mixed
> models there is no need to use PQL or to discuss quadrature because
> the evaluation of the profiled deviance or the profiled REML criterion
> is a direct calculation.
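>
> Concretely, for a linear mixed model the profiled deviance has a
> closed form; in the notation of the lme4 papers (a sketch, not quoted
> from this thread) it is
>
>   -2 \tilde\ell(\theta) = \log|L_\theta|^2 + n [1 + \log(2\pi r^2(\theta)/n)]
>
> where L_\theta is a sparse Cholesky factor and r^2(\theta) is the
> penalized residual sum of squares, so each evaluation is a direct
> matrix computation with no quadrature involved.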
>
> Again, without a more complete description of the data and the model
> to be fit it is impossible to discuss these issues meaningfully.
>
>> Another explanation for the difference in time might be the way HLM6
>> saves the results of some matrix calculations and re-uses them behind
>> the scenes.  If every call to lmer is re-calculating some big matrix
>> results, I suppose that could explain it.
>
> But you don't need to speculate about what lmer does.  It is Open
> Source so you can check for yourself.
>
> However, this does bring up another point which is the need to compare
> apples with apples when you are benchmarking software.  If the data
> import and model specification stages in HLM6 create the necessary
> matrix structures for performing the iterative fit then does the time
> to fit the model consist solely of the optimization and summary
> stages?  Using an expression like
>
> system.time(fm1 <- lmer(...))
>
> is assessing the time to take the original data, which could be in a
> very general form, create all those internal structures and perform
> the optimization.
>
> You should bear in mind that a lot of that construction of the model
> structures is written in R exactly so that it is capable of fitting
> very general model specifications.  The code in HLM6 is, I imagine,
> compiled code, which is possible because it targets a very specific
> task, and compiled code is always going to be much faster than
> interpreted code.
>
>> There are comparisons from 2006 here
>
>> http://www.cmm.bristol.ac.uk/learning-training/multilevel-m-software/tables.shtml
>
>> that indicate that lme was much slower than HLM, but that doesn't help
>> me understand *why* there is a difference.
>
> I believe that those examples are available for testing with lmer in
> the mlmRev package for R.  However, the results can't be compared
> meaningfully to the tabled results because today's hardware is so
> different from what was used for those comparisons.
>
> Besides, none of those examples is really challenging.  I was just
> doing some timing tests on a model fit to 1.7 million observations
> with 750,000 random effects associated with four different, non-nested
> grouping factors and 40 fixed-effects parameters.  To me that is a
> meaningful test because it took from 1 to 2 hours to fit on a fast
> server computer.  Unfortunately, I can't make that data available
> because of confidentiality restrictions but, even if I could, I don't
> think the model could be fit in HLM6 or MLWin or SAS PROC MIXED or
> Stata because there are 750,000 random effects in a non-nested
> configuration.
>
> I can make a similar but smaller test case available using the star
> (Student-Teacher Achievement Ratio) data from the mlmRev package.  One
> model could be
>
>> system.time(fm1 <- lmer(math ~ gr + cltype + sx + eth + (1|id) + (1|tch) + (1|sch), star))
>   user  system elapsed
>  9.885   0.044  10.508
>> print(fm1, corr = FALSE)
> Linear mixed model fit by REML
> Formula: math ~ gr + cltype + sx + eth + (1 | id) + (1 | tch) + (1 | sch)
>   Data: star
>    AIC    BIC  logLik deviance REMLdev
>  239236 239365 -119602   239245  239204
> Random effects:
>  Groups   Name        Variance Std.Dev.
>  id       (Intercept) 1001.60  31.648
>  tch      (Intercept)  295.03  17.176
>  sch      (Intercept)  104.78  10.236
>  Residual              397.32  19.933
> Number of obs: 24578, groups: id, 10732; tch, 1374; sch, 80
>
> Fixed effects:
>            Estimate Std. Error t value
> (Intercept) 560.8868     1.5745   356.2
> gr.L         95.6126     1.0080    94.9
> gr.Q         -4.6783     0.9857    -4.7
> gr.C         -3.2320     0.9729    -3.3
> cltypereg    -7.7943     1.3322    -5.9
> cltypereg+A  -7.0055     1.3265    -5.3
> sxF           2.8766     0.6836     4.2
> ethB        -21.9197     1.2219   -17.9
> ethA          3.1940     6.7037     0.5
> ethH          4.5411     9.6653     0.5
> ethI        -28.1618    13.5359    -2.1
> ethO          3.4356     8.3133     0.4
>
> Unfortunately, because that model is based on only 25000 observations
> and 11000 random effects it is difficult to get a meaningful timing.
> Timings under 10 seconds, like this, are subject to too much
> variability.
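>
> If you do want a more stable number from a fit this small, one rough
> approach (a sketch) is to average over several replications:
>
> tm <- system.time(for (i in 1:10)
>     fm1 <- lmer(math ~ gr + cltype + sx + eth +
>                 (1 | id) + (1 | tch) + (1 | sch), star))
> tm / 10   # approximate time per fit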
>
> Nevertheless you can take that data and try to fit that model in any
> of the other software systems.  Note that student (the "id" factor) is
> not nested within teacher ("tch") or school, as will almost always
> happen when fitting longitudinal data so the random effects are
> partially crossed.  In other words, they are not nested so the
> structure is not hierarchical or multi-level in the sense of HLM or
> MLWin.  I would invite those who have access to such software to fit
> the model using lme4 and to fit an equivalent model in other systems
> and tell us how they compare.  I can't do that because I choose not to
> use proprietary software.
>
>> 2. What does "stack imbalance in .Call" mean in lmer?
>
> It is a sign of a bug in the C code underlying the lme4 package
> (typically an unbalanced PROTECT()/UNPROTECT() count in a routine
> invoked through .Call).  It should be reported as a bug if it
> occurred in the currently released version of lme4.
>
>> Here's why I ask.  Searching for comparisons of lmer and HLM,  I went
>> to CRAN &  I checked this document:
>>
>> http://cran.r-project.org/web/packages/mlmRev/vignettes/MlmSoftRev.pdf
>>
>> I *think* these things are automatically generated.  The version
>> that's up there at this moment  (mlmRev edition 0.99875-1)  has pages
>> full of the error message:
>>
>> stack imbalance in .Call,
>>
>> Were those always there?  I don't think so.   What do they mean?
>
> No they shouldn't be there.  I'll update the mlmRev package.
>
>> 3. In the HLM6 output, there is a message at the end of the variable list:
>>
>> '%' - This level-1 predictor has been centered around its grand mean.
>> '$' - This level-2 predictor has been centered around its grand mean.
>>
>> What effect does that have on the estimates?  I believe it should have
>> no effect on the fixed effect slope estimates, but it seems to me the
>> estimates of the variances of random parameters would be
>> changed.  In order to make the estimates from lmer as directly
>> comparable as possible, should I manually center all of the variables
>> before fitting the model?   I'm a little stumped on how to center a
>> multi-category factor before feeding it to lmer.  Know what I mean?
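>
> A rough sketch of grand-mean centering in R (the data set 'mydata'
> and variable 'x1' are hypothetical): for a numeric predictor,
>
> mydata$x1c <- as.numeric(scale(mydata$x1, scale = FALSE))
>
> For a multi-category factor, one option is to center the columns of
> its contrast matrix and supply those as numeric predictors, e.g. with
> the star data used above:
>
> X  <- model.matrix(~ eth, data = star)[, -1]   # drop the intercept
> Xc <- scale(X, scale = FALSE)                  # grand-mean center
>
> Note that when the model contains random slopes, centering moves the
> zero point to which the random-intercept variance refers, so those
> estimates can legitimately differ between centered and uncentered
> fits.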
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

