[R-sig-ME] More naive questions: Speed comparisons? what is a "stack imbalance" in lmer? does lmer center variables?

Wed Sep 23 20:46:47 CEST 2009

Of course asreml is "not part of R", but it is certainly available in
R.  R's license allows for closed-source packages, just not on CRAN.
To call this "dishonest" is most peculiar.  Is REvolutionR acting
dishonestly with some of their offerings?

I'm a strong believer in collaborative development and open source,
but I believe there's room for closed development models too.  More
than that, I would even argue that it is *helpful* to R.  Remember
that MASS, Design, Hmisc, nlme, and survival all started with S-Plus.
Without the existence of S-Plus we probably would be using
xlisp-stat-nlme and have to deal with even more parentheses!  I'm sure
Doug could enlighten us with some interesting stories about how nlme
started as part of S-Plus.

From the perspective of developing personal skills that are portable
to different platforms or careers or whatever, I wish I could use an
open source mixed models package, but neither nlme nor lme4 nor
MCMCglmm can fit models to large data sets with a variety of complex
variance structures, so I use asreml.

On a lighter note, I propose that the members of this list create the
"Doug Bates foundation" and establish funding for Doug to quit his day
job and spend his life finishing lme4.

Kevin Wright

On Wed, Sep 23, 2009 at 11:31 AM, Douglas Bates <bates at stat.wisc.edu> wrote:
> Got to disagree with you, Kevin.  admb and asreml are not part of R,
> even in the general sense of R packages.  R is Open Source - they are
> not. Tacking on an R interface to proprietary software and saying it
> is available in R is misleading and dishonest.
>
> On Wed, Sep 23, 2009 at 8:54 AM, Kevin Wright <kw.stat at gmail.com> wrote:
>> Paul,
>>
>> It appears to me that the published timings you reference are
>> comparing the __nlme__ package with other software.  So the answer is
>> yes, nlme really is that slow for some models.  You are probably aware
>> that the __lme4__ package has faster algorithms.
>>
>> There are many ways to fit mixed models in R, including nlme, lme4,
>> MCMCglmm, admb, asreml, BUGS, etc.  If I were teaching a course, I would
>> try to expose students to at least two of those in some detail and
>> touch briefly on the others: nlme can fit a variety of complex
>> variance structures, lme4 has faster algorithms, asreml is the only
>> choice of animal/plant breeders and has commercial support, MCMCglmm
>> has some Bayesian aspects and can fit some heteroskedastic variance
>> structures, admb is used in Fish & Wildlife, etc.
>>
>> Mixed model fitting in R is definitely not a case of "one size fits all".
>>
>> Kevin Wright
>>
>>
>> On Wed, Sep 23, 2009 at 1:36 AM, Paul Johnson <pauljohn32 at gmail.com> wrote:
>>> Sent this to r-sig-debian by mistake the first time.  Depressing.
>>>
>>> 1.  One general question for general discussion:
>>>
>>> Is HLM6 faster than lmer? If so, why?
>>>
>>> I'm always advocating R to students, but some faculty members are
>>> skeptical.  A colleague compared the commercial HLM6 software to lmer.
>>>  HLM6 seems to fit the model in 1 second, but lmer takes 60 seconds.
>>>
>>> If you have HLM6 (I don't), can you tell me if you see similar differences?
>>>
>>> My first thought was that HLM6 uses PQL by default, and it would be
>>> faster.  However, in the output, HLM6 says:
>>>
>>> Method of estimation: restricted maximum likelihood
>>>
>>> But that doesn't tell me what quadrature approach they use, does it?
>>>
>>> Another explanation for the difference in time might be the way HLM6
>>> saves the results of some matrix calculations and re-uses them behind
>>> the scenes.  If every call to lmer is re-calculating some big matrix
>>> results, I suppose that could explain it.
>>>
>>> There are comparisons from 2006 here
>>>
>>> http://www.cmm.bristol.ac.uk/learning-training/multilevel-m-software/tables.shtml
>>>
>>> that indicate that lme was much slower than HLM, but that doesn't help
>>> me understand *why* there is a difference.
>>>
>>> 2. What does "stack imbalance in .Call" mean in lmer?
>>>
>>> Here's why I ask.  Searching for comparisons of lmer and HLM, I went
>>> to CRAN and checked this document:
>>>
>>> http://cran.r-project.org/web/packages/mlmRev/vignettes/MlmSoftRev.pdf
>>>
>>> I *think* these things are automatically generated.  The version
>>> that's up there at this moment  (mlmRev edition 0.99875-1)  has pages
>>> full of the error message:
>>>
>>> stack imbalance in .Call,
>>>
>>> Were those always there?  I don't think so.   What do they mean?
>>>
>>> 3. In the HLM6 output, there is a message at the end of the variable list:
>>>
>>> '%' - This level-1 predictor has been centered around its grand mean.
>>> '$' - This level-2 predictor has been centered around its grand mean.
>>>
>>> What effect does that have on the estimates?  I believe it should have
>>> no effect on the fixed effect slope estimates, but it seems to me the
>>> estimates of the variances of random parameters would be
>>> changed.  In order to make the estimates from lmer as directly
>>> comparable as possible, should I manually center all of the variables
>>> before fitting the model?   I'm a little stumped on how to center a
>>> multi-category factor before feeding it to lmer.  Know what I mean?
>>>
>>> pj
>>>
>>> --
>>> Paul E. Johnson
>>> Professor, Political Science
>>> 1541 Lilac Lane, Room 504
>>> University of Kansas
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>
>
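
On Paul's third question (centering before calling lmer), here is a
minimal sketch in R of what grand-mean centering could look like.  It
is an illustration under stated assumptions, not a description of what
HLM6 does internally: the sleepstudy data that ships with lme4 stands
in for the real data, the factor `grp` is simulated and purely
hypothetical, and centering the dummy columns of a factor is only one
possible way to handle a multi-category predictor.

```r
# Sketch only: sleepstudy ships with lme4; the factor `grp` below is
# simulated and hypothetical, added just to show the mechanics.
library(lme4)

data(sleepstudy)
d <- sleepstudy

# Grand-mean center a numeric predictor by subtracting its overall mean.
d$cDays <- d$Days - mean(d$Days)

# Same model, fitted with the raw and with the centered predictor.
fit_raw <- lmer(Reaction ~ Days  + (Days  | Subject), data = d)
fit_ctr <- lmer(Reaction ~ cDays + (cDays | Subject), data = d)

# Compare the fixed effects and the variance components side by side.
fixef(fit_raw)
fixef(fit_ctr)
VarCorr(fit_raw)
VarCorr(fit_ctr)

# A multi-category factor: expand it to dummy columns (dropping the
# intercept column), then subtract each column's mean.
set.seed(1)
d$grp <- factor(sample(c("a", "b", "c"), nrow(d), replace = TRUE))
dummies  <- model.matrix(~ grp, data = d)[, -1, drop = FALSE]
centered <- scale(dummies, center = TRUE, scale = FALSE)
d$grp_b <- centered[, "grpb"]
d$grp_c <- centered[, "grpc"]

fit_fac <- lmer(Reaction ~ cDays + grp_b + grp_c + (cDays | Subject),
                data = d)
```

Centering is a reparameterization of the same model, so the slope for
Days should agree between the two fits while the intercept shifts by
the slope times mean(Days).  With a random slope in the model, the
random-intercept variance and the intercept-slope correlation are
expected to differ, because the intercept now refers to the mean of
Days rather than to Days = 0; with random intercepts only, the
variance estimates should not change.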