[R-sig-ME] Bleeding edge lme4 (or lme4a) plus DF estimation

Tue Feb 28 00:03:58 CET 2012

On Mon, Feb 27, 2012 at 4:48 PM, Joehanes, Roby (NIH/NHLBI) [F]
<roby.joehanes at nih.gov> wrote:
> Hi Ben:
>
> Thank you very much. I think I will familiarize myself with the lme4Eigen code for a while. A few more questions:
>
> 1. I noticed that there is lme4Eigen version 0.9996875-9 in the repository:
> http://lme4.r-forge.r-project.org/repos/src/contrib/
> Which SVN revision does it correspond to? I am currently using it and it appears to be quite stable.

The SVN revision number is stored in the DESCRIPTION file for recent
versions of lme4Eigen.  It is actually the revision number for the
last check-in of the DESCRIPTION file but that gets updated fairly
frequently so the revision number in there is a pretty tight lower
bound.

> 2. It seems that the code for revision 1618 isn't that much different than the HEAD branch (I believe 1621 at the moment). Is it really that buggy? I am primarily interested in the lmer, not glmer.

The current revision (1623) should compile.  The breakage was
something I did on Friday and needed to commit because it was only on
my laptop.  I realized later that I had a modified version of
RcppEigen on my laptop and without those modifications the compilation
croaked. I created a version that doesn't use those particular
modifications, at least for now.

> 3. (A different topic) Is there any way to speed up calling lmer on the same X and Z matrices but thousands of different y columns? In general linear model, we can invoke QR decomposition and use Q and R matrices to speed the calculations up. Is there such a decomposition (or method) that we can exploit to speed up the calculation?

Yes, check out the refit function.  I just saw that the documentation
suffers from cut-and-paste errors but the general idea is to give a
fitted model a new response and run only the optimization step.

> (fm1 <- lmer(Yield ~ 1|Batch, Dyestuff))
Linear mixed model fit by REML ['lmerMod']
Formula: Yield ~ 1 | Batch
   Data: Dyestuff

REML criterion at convergence: 319.6543

Random effects:
 Groups   Name        Variance Std.Dev.
 Batch    (Intercept) 1764     42.00
 Residual             2451     49.51
Number of obs: 30, groups: Batch, 6

Fixed effects:
            Estimate Std. Error t value
(Intercept)  1527.50      19.38    78.8
> refit(fm1, Dyestuff2$Yield)
Linear mixed model fit by REML ['lmerMod']
Formula: Yield ~ 1 | Batch
   Data: Dyestuff

REML criterion at convergence: 161.8283

Random effects:
 Groups   Name        Variance Std.Dev.
 Batch    (Intercept)  0.00    0.000
 Residual             13.81    3.716
Number of obs: 30, groups: Batch, 6

Fixed effects:
            Estimate Std. Error t value
(Intercept)   5.6656     0.6784   8.352

> Finally, thank you for the pointer on Kenward-Roger DF estimation.
>
> Sincerely,
> Roby
>
> --------
> Roby Joehanes
> Research Associate
> Roby.Joehanes at nih.gov
> Building 12A, Room 2007
> National Institutes of Health (NIH)
> Bethesda, MD 20892
> P: (301) 402-8702
> F: (301) 480-0028 or (301) 402-2867
>
>
> On Feb 24, 2012, at 8:50 PM, Ben Bolker wrote:
>
>> Joehanes, Roby (NIH/NHLBI) [F] <roby.joehanes at ...> writes:
>>
>>> I learned about the impending release of the new version of lme4
>>> (or lme4a) from Dr. Bates' post here:
>>> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2012q1/007499.html
>>>
>>> Firstly, just to make sure, is this based on lme4a?
>>> I have the source tarball of lme4a 0.999375-65 and found
>>> that a new optimizer, BOBYQA, is now in use, instead of the nlminb.
>>> The announcement above also mentioned
>>> the same change.
>>> So, I suspect that the older lme4 is supplanted with lme4a. Is this true?
>>
>>  I'm not quite sure what you mean.  There are three versions of lme4:
>>
>> * lme4 "classic" (version 0.999375-42 is the latest) lme4, built on nlminb
>>  (as was Bates's previous mixed-models package, nlme)
>>
>> *** lme4 will be preserved on CRAN and (probably) renamed "lme4.0" after
>> the new version (see below) is released as lme4, for users who
>> need backward compatibility (we thought this was a reasonable
>> name: there are still a few issues with package names containing
>> dots, so we may need to change this ...).
>>
>> * lme4a (0.9996875-1) uses bobyqa
>>
>> * lmeEigen (to be renamed lme4 when released, sometime soon ...)  uses
>> a mixture of bobyqa and a new Nelder-Mead implementation that allows box
>> constraints, adapted from the nloptr package, which in turn wraps the
>> NLopt open-source optimization codes
>> (http://ab-initio.mit.edu/wiki/index.php/NLopt_Introduction).
>>
>>> Secondly, I would like to get a source tarball of the latest
>>> bleeding edge release to play with (or even
>>> read-only SVN access). I found from some sniff tests the lmer outputs of
>>> lme4a to be closely matched with
>>> those of SAS than those of the lme4 (with nlminb optimizer), except for
>>> the lack of p-values. I would love to
>>> play with the new version and even give you comparisons with the old
>>> version. The problem I am facing with
>>> lme4a 0.999375-65 is that it sometimes crashes (core dumps).
>>
>>  That's easy: just go here for instructions on SVN access:
>>
>> http://r-forge.r-project.org/scm/?group_id=60
>>
>> (the package is *so* bleeding edge that if you get it right now,
>> you should roll back to SVN release 1618; releases 1619+ are
>> currently broken ...)
>>
>>> Thirdly, I also would love to see the Satterthwaite or
>>> Kenward-Rogers DF estimation. I would like to try to add these
>>> features into lme4, if you will. I don't know much about the
>>> formulas to compute the DFs from quantities output by lmer /
>>> glmer. Any pointers?
>>
>> Doug Bates is on record as saying that adapting the Kenward-Roger
>> formulation to work with his code would be difficult
>> <https://stat.ethz.ch/pipermail/r-help/2008-February/155372.html>,
>> but Ulrich Halekoh and Søren Højsgaard have an implementation in
>> the pbkrtest package
>> <http://cran.r-project.org/web/packages/pbkrtest/index.html>
>> which you could look at.  Perhaps that would give you a hint about
>> a Satterthwaite implementation as well ...
>>
>> [PS there is no "s" in "Kenward-Roger" -- this is a very common mistake,
>> Kenward-Roger gets 1.03 million google hits while Kenward-Rogers
>> gets 438K ...]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models