[R-sig-ME] More naive questions: Speed comparisons? what is a "stack imbalance" in lmer? does lmer center variables?

Wed Sep 23 10:12:02 CEST 2009

Hi Paul,

I am not at all familiar with HLM6 (and do not plan to become so),
but...

 > 1.  One general question for general discussion:
 >
 > Is HLM6 faster than lmer? If so, why?
 >
 > I'm always advocating R to students, but some faculty members are
 > skeptical.

Nowadays it is unethical not to expose students to R. You would be
denying access to a goldmine of statistical algorithms, and lifelong
pleasure, to students who become interested beyond their courses.

 > A colleague compared the commercial HLM6 software to lmer.
 >  HLM6 seems to fit the model in 1 second, but lmer takes 60 seconds.

I'm afraid there is no concrete example to investigate
(nor information related to versions), but whatever
the outcome may be (and I am 'skeptical' w.r.t. the reported
timings), I would not trust a fast blackbox compared to
software for which the algorithms are publicly available
as well as every single line of code that implements
them.
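
To make such a comparison concrete, a timing could be reproduced along
the following lines. This is only a sketch: it assumes the lme4 package
is installed and uses its bundled sleepstudy data as a stand-in, since
the model your colleague fitted is not available to me.

```r
library(lme4)

# sleepstudy ships with lme4; it stands in for the unavailable HLM6 example
data(sleepstudy)

# Time a REML fit with a random intercept and slope per subject
timing <- system.time(
  fm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy, REML = TRUE)
)

print(timing)         # user, system and elapsed seconds
print(sessionInfo())  # R and package versions, needed for a fair comparison
```

Posting the model call, the data dimensions and the sessionInfo() output
alongside the HLM6 timing would give the list something to investigate.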

Some other questions that come to mind are:

Is HLM available on all platforms ?
Is HLM capable of fitting models to huge datasets ?
Can one easily share research results with colleagues in
a way that they can reproduce the results (using free
[in all senses] software)?
Does HLM provide graphics systems coming near to R's ?

 > If you have HLM6 (I don't), can you tell me if you see similar
 > differences?

Apparently it is possible to download a trial version for 15 days:

http://www.ssicentral.com/hlm/downloads.html

 > My first thought was that HLM6 uses PQL by default, and it would be
 > faster.  However, in the output, HLM6 says:
 >
 > Method of estimation: restricted maximum likelihood
 >
 > But that doesn't tell me what quadrature approach they use, does it?
 >
 > Another explanation for the difference in time might be the way HLM6
 > saves the results of some matrix calculations and re-uses them behind
 > the scenes.  If every call to lmer is re-calculating some big matrix
 > results, I suppose that could explain it.
 >
 > There are comparisons from 2006 here
 >
 > http://www.cmm.bristol.ac.uk/learning-training/multilevel-m-software/tables.shtml
 >
 > that indicate that lme was much slower than HLM, but that doesn't help
 > me understand *why* there is a difference.

The knowledgeable may correct me, but lmer internals are entirely 
different from those of lme, so I don't think you can take these
results as a starting point.

Best,
Tobias

> 2. What does "stack imbalance in .Call" mean in lmer?
> 
> Here's why I ask.  Searching for comparisons of lmer and HLM,  I went
> to CRAN &  I checked this document:
> 
> http://cran.r-project.org/web/packages/mlmRev/vignettes/MlmSoftRev.pdf
> 
> I *think* these things are automatically generated.  The version
> that's up there at this moment  (mlmRev edition 0.99875-1)  has pages
> full of the error message:
> 
> stack imbalance in .Call,
> 
> Were those always there?  I don't think so.   What do they mean?
> 
> 3. In the HLM6 output, there is a message at the end of the variable list:
> 
> '%' - This level-1 predictor has been centered around its grand mean.
> '$' - This level-2 predictor has been centered around its grand mean.
> 
> What effect does that have on the estimates?  I believe it should have
> no effect on the fixed effect slope estimates, but it seems to me the
> estimates of the variances of random parameters would be
> changed.  In order to make the estimates from lmer as directly
> comparable as possible, should I manually center all of the variables
> before fitting the model?   I'm a little stumped on how to center a
> multi-category factor before feeding it to lmer.  Know what I mean?
> 
> pj
>
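
On question 3, one way to see what grand-mean centering does is to fit
the same model with and without it and compare the estimates. The sketch
below assumes lme4 and its sleepstudy data (not the HLM6 model in
question); the names cDays and center_factor are made up for
illustration. For a multi-category factor, one option is to center the
columns of its dummy coding, which is what the helper does.

```r
library(lme4)
data(sleepstudy)

# Grand-mean center a numeric predictor
sleepstudy$cDays <- sleepstudy$Days - mean(sleepstudy$Days)

fm_raw <- lmer(Reaction ~ Days  + (Days  | Subject), data = sleepstudy)
fm_ctr <- lmer(Reaction ~ cDays + (cDays | Subject), data = sleepstudy)

# Compare the fixed-effect slopes and the random-effect variances
fixef(fm_raw)[["Days"]]
fixef(fm_ctr)[["cDays"]]
VarCorr(fm_raw)
VarCorr(fm_ctr)

# Hypothetical helper: center the treatment-contrast dummies of a factor
center_factor <- function(f) {
  X <- model.matrix(~ f)[, -1, drop = FALSE]  # drop the intercept column
  scale(X, center = TRUE, scale = FALSE)      # subtract each column mean
}

g <- factor(c("a", "b", "c", "b", "a", "c", "c"))
colMeans(center_factor(g))  # numerically zero for every column
```

With a random slope and an unstructured covariance matrix, centering is
only a reparameterization: the slope and its variance should agree up to
optimizer tolerance, while the intercept, its variance and the
intercept-slope covariance now refer to the mean of the predictor rather
than to zero.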