[R-sig-ME] Bug / problems in optimizer of lmer.

Wed Feb 3 19:46:47 CET 2010

On Wed, Feb 3, 2010 at 12:18 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
> On Wed, Feb 3, 2010 at 11:43 AM, Oliver Duerr <Oliver.Duerr at genedata.com> wrote:
>> Dear Douglas Bates,
>>
>> Thanks for the extremely fast and detailed answer.
>>
>> Of course, as you said, 0 is well within any reasonable confidence interval on the standard deviation of the random effects and therefore this is not a real problem for that model and data set. It might be a bit of a pathologic data set, but I still ask myself why the optimizer is trapped around zero. I do not believe this is a real local minimum but rather a numeric artifact looking like a local minimum or a problem with the optimizer. I tried to evaluate and plot the deviance without fitting but I could not download the lme4a package.
>> I am kind of a newbie to R, after a google search I tried
>>
>>        install.packages("lme4a", repos="http://R-Forge.R-project.org")
>>
>> but that did not work out (package 'lme4a' is not available).
>
> Thanks for the "heads up".  I frequently don't know what packages have
> been prepared successfully on R-forge because I install these packages
> from the sources - several times on some days.  It looks as the
> Windows and Mac OS X packages are encountering conflicting definition
> in C++ and C headers.  The Linux packages are not being built because
> the available version of Matrix is out of date.  All these issues
> could be addressed given sufficient time, which, regrettably, I don't
> have right now.
>
> I'll see if Uwe's win-builder service can create binary Windows
> packages, although I may need to wait until Matrix_0.999375-35 gets on
> the archives.

To follow up on this, the win-builder succeeds in building a Windows
package but only for R-devel running on 64-bit Windows.  I'll see what
happens for R-patched and 32-bit R-devel once Matrix_0.999375-35 is
installed.  I think there will still be problems with respect to
R-patched because of the C++/C header files conflict.

>> Finally, I guess if this kind sticking to zero only happens for data with their minimum close to zero anyway, than this should not be a problem at all.
>>
>> Thanks again for your help and all the best,
>>
>> Oliver
>>
>>
>>
>>
>>
>> -----Original Message-----
>> From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas Bates
>> Sent: Wednesday, February 03, 2010 4:29 PM
>> To: Oliver Duerr
>> Cc: r-sig-mixed-models at r-project.org
>> Subject: Re: [R-sig-ME] Bug / problems in optimizer of lmer.
>>
>> On Wed, Feb 3, 2010 at 8:32 AM, Oliver Duerr <Oliver.Duerr at genedata.com> wrote:
>>> Hello all,
>>>
>>> First of all I don't know if I am in the right news group to report bug / suspicious behavior in lme4. If not I do apologize.
>>>
>>> I am currently developing a linear mixed model implementation. For
>>> testing I used the model
>>>
>>> y ~ dose * day + (1 | comp)
>>>
>>> and compared my implementation with lmer from lme4_0.999375-32 for about 30'000 different values of y.
>>> Generally the comparison is quite good but for certain y values (approx 100) I found small differences. Looking closer at some of those cases  I observe that the optimization reaches 0 but then it  cannot "escape" from anymore.
>>>
>>> One example, reproducible with the data provided below:
>>>>fit <- lmer(y ~ dose * day + (1 | comp), verbose=TRUE)
>>>  0:    -21.202738: 0.245256
>>>  1:    -21.863496:  0.00000
>>>  2:    -21.863496: 4.14318e-06
>>>
>>> Plotting the one dimensional reduced deviance reveals a minimum at ~ 0.0862 (this minimum can also be found by providing a different start value).
>>>
>>>> fit <- lmer(y ~ dose * day + (1 | comp), verbose=TRUE, start =
>>>> c(0.05))
>>>  0:    -21.917896: 0.0500000
>>>  1:    -21.950001: 0.0784911
>>>  2:    -21.951063: 0.0913211
>>>  3:    -21.951911: 0.0860209
>>>  4:    -21.951913: 0.0862628
>>>  5:    -21.951913: 0.0862698
>>>
>>> In my implementation I had similar problems when I had to evaluate the likelihood at very small numbers it began to fluctuated and the optimizer had been trapped in a local minimum due to these numeric instabilities.
>>>
>>> Is this behavior known or do I maybe use lmer in a wrong way?
>>
>> It is known in the sense that numerical optimization is not guaranteed to produce a global optimum.  At best it can produce a local optimum.
>> On my system the REML fit does converge to the optimum you indicate but, of course, 0 is well within any reasonable confidence interval on the standard deviation of the random effects.  The ML fit converges to an estimate of zero.
>>
>>> print(fit <- lmer(y ~ dose * day + (1 | comp), verbose=TRUE), corr =
>>> FALSE)
>>  0:    -16.846009:  1.00000
>>  1:    -21.863496:  0.00000
>>  2:    -21.863496: 3.34671e-05
>>  3:    -21.864249: 0.00511679
>>  4:    -21.910197: 0.0451168
>>  5:    -21.951246: 0.0817081
>>  6:    -21.951470: 0.0899260
>>  7:    -21.951913: 0.0861839
>>  8:    -21.951913: 0.0862678
>>  9:    -21.951913: 0.0862697
>> Linear mixed model fit by REML
>> Formula: y ~ dose * day + (1 | comp)
>>  REML
>> -21.95
>>
>> Random effects:
>>  Groups   Name        Variance   Std.Dev.
>>  comp     (Intercept) 0.00025129 0.015852
>>  Residual             0.03376472 0.183752
>> Number of obs: 133, groups: comp, 3
>>
>> Fixed effects:
>>              Estimate Std. Error t value
>> (Intercept)  0.9553992  0.0183807   51.98
>> dose1       -0.0057256  0.0194846   -0.29
>> dose2        0.0189734  0.0112938    1.68
>> day1         0.0420788  0.0193691    2.17
>> day2         0.0032043  0.0113592    0.28
>> dose1:day1  -0.0057732  0.0237223   -0.24
>> dose2:day1  -0.0177054  0.0136961   -1.29
>> dose1:day2  -0.0148491  0.0138588   -1.07
>> dose2:day2   0.0007798  0.0080636    0.10
>>> print(fitML <- lmer(y ~ dose * day + (1 | comp), verbose=TRUE, REML =
>>> 0), corr = FALSE)
>>  0:    -73.180235:  1.00000
>>  1:    -81.824446:  0.00000
>>  2:    -81.824446:  0.00000
>> Linear mixed model fit by maximum likelihood
>> Formula: y ~ dose * day + (1 | comp)
>>    AIC    BIC logLik deviance
>>  -59.82 -28.03  40.91   -81.82
>>
>> Random effects:
>>  Groups   Name        Variance Std.Dev.
>>  comp     (Intercept) 0.000000 0.00000
>>  Residual             0.031647 0.17790
>> Number of obs: 133, groups: comp, 3
>>
>> Fixed effects:
>>              Estimate Std. Error t value
>> (Intercept)  0.9554630  0.0154320   61.91
>> dose1       -0.0057222  0.0188633   -0.30
>> dose2        0.0190337  0.0109334    1.74
>> day1         0.0420788  0.0187520    2.24
>> day2         0.0032680  0.0109970    0.30
>> dose1:day1  -0.0057732  0.0229664   -0.25
>> dose2:day1  -0.0177054  0.0132597   -1.34
>> dose1:day2  -0.0148456  0.0134166   -1.11
>> dose2:day2   0.0008401  0.0078059    0.11
>>
>> Note that there are only three distinct levels of comp.  It is difficult to estimate a variance component with only three distinct levels of the grouping factor.
>>
>> If you check the prediction intervals on the random effects from this model (plot enclosed) you will see that, even for the REML fit, zero is comfortably within the 95% confidence intervals on all the random effects for the comp factor.
>>
>> Having said all this, there is a better way of performing the optimization when there is only one parameter in the profiled deviance or profiled REML criterion.  The development version of the lme4 package, called lme4a on the R-forge site, allows for the model structure to be returned without optimization.  It is packaged in such a way that the parameter estimates can be determined by optimizers other than nlminb, which is the default.
>>
>> The optimize function is usually more effective for one-dimensional problems than are more general optimizers such as nlminb or bobyqa.
>> If we use optimize on this problem
>>
>>> fit <- lmer(y ~ dose * day + (1 | comp), doFit = FALSE)
>>> optimize(fit at setPars, interval = c(0,4))
>> $minimum
>> [1] 0.08624942
>>
>> $objective
>> [1] -21.95191
>>
>>> fit <- lmer(y ~ dose * day + (1 | comp), REML = 0, doFit = FALSE)
>>> optimize(fit at setPars, interval = c(0,4))
>> $minimum
>> [1] 6.96079e-05
>>
>> $objective
>> [1] -81.82445
>>
>> we do get convergence to non-zero parameter values, although the differences in the objective (either the profiled REML criterion or the profiled deviance) at these values compared to those at zero are negligible.
>>
>