[R-sig-ME] More naive questions: Speed comparisons? what is a "stack imbalance" in lmer? does lmer center variables?

Wed Sep 23 08:36:46 CEST 2009

Sent this to r-sig-debian by mistake the first time.  Depressing.

1.  One general question for general discussion:

Is HLM6 faster than lmer? If so, why?

I'm always advocating R to students, but some faculty members are
skeptical.  A colleague compared the commercial HLM6 software to lmer.
HLM6 seems to fit the model in 1 second, but lmer takes 60 seconds.

If you have HLM6 (I don't), can you tell me if you see similar differences?

My first thought was that HLM6 uses PQL by default, and that would be
faster.  However, in the output, HLM6 says:

Method of estimation: restricted maximum likelihood

But that doesn't tell me what quadrature approach they use, does it?

Another explanation for the difference in time might be the way HLM6
saves the results of some matrix calculations and re-uses them behind
the scenes.  If every call to lmer is re-calculating some big matrix
results, I suppose that could explain it.

There are comparisons from 2006 here

http://www.cmm.bristol.ac.uk/learning-training/multilevel-m-software/tables.shtml

that indicate that lme was much slower than HLM, but that doesn't help
me understand *why* there is a difference.

2. What does "stack imbalance in .Call" mean in lmer?

Here's why I ask.  While searching for comparisons of lmer and HLM, I
went to CRAN and checked this document:

http://cran.r-project.org/web/packages/mlmRev/vignettes/MlmSoftRev.pdf

I *think* these vignettes are automatically generated.  The version
that's up there at this moment (mlmRev edition 0.99875-1) has pages
full of the error message:

stack imbalance in .Call,

Were those always there?  I don't think so.   What do they mean?

3. In the HLM6 output, there is a message at the end of the variable list:

'%' - This level-1 predictor has been centered around its grand mean.
'$' - This level-2 predictor has been centered around its grand mean.

What effect does that have on the estimates?  I believe it should have
no effect on the fixed effect slope estimates, but it seems to me the
estimates of the variances of the random effects would be
changed.  In order to make the estimates from lmer as directly
comparable as possible, should I manually center all of the variables
before fitting the model?  I'm a little stumped on how to center a
multi-category factor before feeding it to lmer.  Know what I mean?
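
To make the centering question concrete, here is a minimal sketch of the arithmetic. It is written in Python with only the standard library and uses ordinary least squares on made-up numbers, so it is not lmer and not HLM6. The helper names (center, ols, centered_dummies) are hypothetical, and treating "a" as the reference level is an assumption. It shows that grand-mean centering a numeric predictor leaves the slope alone and moves the intercept to the mean of the response. It also shows one way to "center" a multi-category factor: expand it into 0/1 indicator columns and subtract each column's mean.

```python
from statistics import mean


def center(values):
    """Subtract the grand mean from every value."""
    m = mean(values)
    return [v - m for v in values]


def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def centered_dummies(factor, reference):
    """Expand a factor into 0/1 indicator columns, dropping the reference
    level, then grand-mean center each indicator column."""
    levels = sorted(set(factor) - {reference})
    return {
        lev: center([1.0 if f == lev else 0.0 for f in factor])
        for lev in levels
    }


# Made-up illustrative data, not taken from any real study.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
group = ["a", "b", "c", "a", "b"]

a_raw, b_raw = ols(x, y)          # fit on the raw predictor
a_ctr, b_ctr = ols(center(x), y)  # fit on the centered predictor

# Centered indicator columns for levels "b" and "c" ("a" is the reference).
dummies = centered_dummies(group, reference="a")

print("slope raw vs centered:    ", b_raw, b_ctr)
print("intercept raw vs centered:", a_raw, a_ctr)
```

The two slopes agree, and the centered intercept equals the mean of y. In a mixed model with a random slope, the intercept variance is the variance of the group lines where the predictor equals zero, so moving that zero point by centering would be expected to change it. In R the analogous steps would presumably be scale(x, scale = FALSE) for a numeric column and centering the columns returned by model.matrix() for a factor; whether that reproduces the HLM6 setup exactly is an assumption that would need checking.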

pj

-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas