[R-sig-ME] Testing a new version of lmer
Douglas Bates
bates at stat.wisc.edu
Wed Feb 18 22:36:52 CET 2009
For some time I have been considering changing the parameterization
used by lmer for the covariance matrix of the random effects. Today a
person contacted me regarding an lmer fit that gave very different
answers from SAS PROC MIXED. It turns out that there were at least
two local optima for the REML criterion and PROC MIXED was picking up
the global maximum while lmer converged to a different local maximum.
The root cause was an group for which the response was very different
from the others and if you removed that group then lmer did fine.
Nevertheless I was chagrined to see lmer converge to a local optimum
and not a global optimum
This motivated me to make the change. I have been thinking about it
for months and it ended up taking about half an hour. I wish I would
have just done it when I thought of it.
The good news is that the modified code seems more reliable. The bad
news is that I made the change in an experimental version of the code
which is in the "allcoef" branch of the SVN archive at R-forge. (If
that sounds like gibberish to you, you can skip the rest of this
message.)
I think I will thread the change back into the released version of
lme4 but first I want to check that it is not harmful. The testing
that I have done indicates that it does at least as well as the
current version and usually better. I haven't uncovered any cases
where the current version converges but the modified version doesn't.
However, distributed testing is much more effective than my trying to
test different variations.
If you have a favorite linear mixed-effects example, preferably a
difficult fit, then I would appreciate your comparing the old method
and the new method. Here is a trivial example using the Dyestuff data
Old:
> (fm1 <- lmer(Yield ~ 1|Batch, Dyestuff, verb = 1))
0: 319.76562: 0.730297
1: 319.73549: 0.962389
2: 319.65735: 0.869461
3: 319.65441: 0.844025
4: 319.65428: 0.848469
5: 319.65428: 0.848327
6: 319.65428: 0.848324
Linear mixed model fit by REML
Formula: Yield ~ 1 | Batch
Data: Dyestuff
AIC BIC logLik deviance REMLdev
325.7 329.9 -159.8 327.4 319.7
Random effects:
Groups Name Variance Std.Dev.
Batch (Intercept) 1764.0 42.001
Residual 2451.3 49.510
Number of obs: 30, groups: Batch, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 1527.50 19.38 78.81
New:
> (fm1 <- lmer(Yield ~ 1|Batch, Dyestuff, verb = 1))
0: 319.79239: 1.00000
1: 319.65467: 0.925112
2: 319.65428: 0.920527
3: 319.65428: 0.921046
4: 319.65428: 0.921045
Linear mixed model fit by REML
Formula: Yield ~ 1 | Batch
Data: Dyestuff
AIC BIC logLik deviance REMLdev
325.7 329.9 -159.8 327.4 319.7
Random effects:
Groups Name Variance Std.Dev.
Batch (Intercept) 1764.0 42.001
Residual 2451.3 49.510
Number of obs: 30, groups: Batch, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 1527.50 19.38 78.8
(To save you wondering what the change is, the new method uses the
square root of the relative standard deviation whereas the old method
used the relative standard deviation. Also the starting values for
the new method are trivial. It always starts at 1.)
You can help with examples in one of two ways. If you are comfortable
installing a package from source then you can install the version of
lme4 from the allcoef branch and try both versions on the example as
above. If you are not up to checking out code from an SVN archive and
installing a source package then please perform the fit in the old
version with verb = 1 and send me the output and the data so I can
check out the fit using the new lme4 and compare it to the old fit.
Thanks in advance for your help.
More information about the R-sig-mixed-models
mailing list