[R-sig-ME] Bug / problems in optimizer of lmer.

Wed Feb 3 18:43:44 CET 2010

Dear Douglas Bates,

Thanks for the extremely fast and detailed answer.

Of course, as you said, 0 is well within any reasonable confidence interval on the standard deviation of the random effects and therefore this is not a real problem for that model and data set. It might be a bit of a pathologic data set, but I still ask myself why the optimizer is trapped around zero. I do not believe this is a real local minimum but rather a numeric artifact looking like a local minimum or a problem with the optimizer. I tried to evaluate and plot the deviance without fitting but I could not download the lme4a package.
I am kind of a newbie to R, after a google search I tried

	install.packages("lme4a", repos="http://R-Forge.R-project.org")

but that did not work out (package 'lme4a' is not available).

Finally, I guess if this kind sticking to zero only happens for data with their minimum close to zero anyway, than this should not be a problem at all.

Thanks again for your help and all the best,

Oliver 

-----Original Message-----
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas Bates
Sent: Wednesday, February 03, 2010 4:29 PM
To: Oliver Duerr
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Bug / problems in optimizer of lmer.

On Wed, Feb 3, 2010 at 8:32 AM, Oliver Duerr <Oliver.Duerr at genedata.com> wrote:
> Hello all,
>
> First of all I don't know if I am in the right news group to report bug / suspicious behavior in lme4. If not I do apologize.
>
> I am currently developing a linear mixed model implementation. For 
> testing I used the model
>
> y ~ dose * day + (1 | comp)
>
> and compared my implementation with lmer from lme4_0.999375-32 for about 30'000 different values of y.
> Generally the comparison is quite good but for certain y values (approx 100) I found small differences. Looking closer at some of those cases  I observe that the optimization reaches 0 but then it  cannot "escape" from anymore.
>
> One example, reproducible with the data provided below:
>>fit <- lmer(y ~ dose * day + (1 | comp), verbose=TRUE)
>  0:    -21.202738: 0.245256
>  1:    -21.863496:  0.00000
>  2:    -21.863496: 4.14318e-06
>
> Plotting the one dimensional reduced deviance reveals a minimum at ~ 0.0862 (this minimum can also be found by providing a different start value).
>
>> fit <- lmer(y ~ dose * day + (1 | comp), verbose=TRUE, start = 
>> c(0.05))
>  0:    -21.917896: 0.0500000
>  1:    -21.950001: 0.0784911
>  2:    -21.951063: 0.0913211
>  3:    -21.951911: 0.0860209
>  4:    -21.951913: 0.0862628
>  5:    -21.951913: 0.0862698
>
> In my implementation I had similar problems when I had to evaluate the likelihood at very small numbers it began to fluctuated and the optimizer had been trapped in a local minimum due to these numeric instabilities.
>
> Is this behavior known or do I maybe use lmer in a wrong way?

It is known in the sense that numerical optimization is not guaranteed to produce a global optimum.  At best it can produce a local optimum.
On my system the REML fit does converge to the optimum you indicate but, of course, 0 is well within any reasonable confidence interval on the standard deviation of the random effects.  The ML fit converges to an estimate of zero.

> print(fit <- lmer(y ~ dose * day + (1 | comp), verbose=TRUE), corr = 
> FALSE)
  0:    -16.846009:  1.00000
  1:    -21.863496:  0.00000
  2:    -21.863496: 3.34671e-05
  3:    -21.864249: 0.00511679
  4:    -21.910197: 0.0451168
  5:    -21.951246: 0.0817081
  6:    -21.951470: 0.0899260
  7:    -21.951913: 0.0861839
  8:    -21.951913: 0.0862678
  9:    -21.951913: 0.0862697
Linear mixed model fit by REML
Formula: y ~ dose * day + (1 | comp)
  REML
-21.95

Random effects:
 Groups   Name        Variance   Std.Dev.
 comp     (Intercept) 0.00025129 0.015852
 Residual             0.03376472 0.183752
Number of obs: 133, groups: comp, 3

Fixed effects:
              Estimate Std. Error t value
(Intercept)  0.9553992  0.0183807   51.98
dose1       -0.0057256  0.0194846   -0.29
dose2        0.0189734  0.0112938    1.68
day1         0.0420788  0.0193691    2.17
day2         0.0032043  0.0113592    0.28
dose1:day1  -0.0057732  0.0237223   -0.24
dose2:day1  -0.0177054  0.0136961   -1.29
dose1:day2  -0.0148491  0.0138588   -1.07
dose2:day2   0.0007798  0.0080636    0.10
> print(fitML <- lmer(y ~ dose * day + (1 | comp), verbose=TRUE, REML = 
> 0), corr = FALSE)
  0:    -73.180235:  1.00000
  1:    -81.824446:  0.00000
  2:    -81.824446:  0.00000
Linear mixed model fit by maximum likelihood
Formula: y ~ dose * day + (1 | comp)
    AIC    BIC logLik deviance
 -59.82 -28.03  40.91   -81.82

Random effects:
 Groups   Name        Variance Std.Dev.
 comp     (Intercept) 0.000000 0.00000
 Residual             0.031647 0.17790
Number of obs: 133, groups: comp, 3

Fixed effects:
              Estimate Std. Error t value
(Intercept)  0.9554630  0.0154320   61.91
dose1       -0.0057222  0.0188633   -0.30
dose2        0.0190337  0.0109334    1.74
day1         0.0420788  0.0187520    2.24
day2         0.0032680  0.0109970    0.30
dose1:day1  -0.0057732  0.0229664   -0.25
dose2:day1  -0.0177054  0.0132597   -1.34
dose1:day2  -0.0148456  0.0134166   -1.11
dose2:day2   0.0008401  0.0078059    0.11

Note that there are only three distinct levels of comp.  It is difficult to estimate a variance component with only three distinct levels of the grouping factor.

If you check the prediction intervals on the random effects from this model (plot enclosed) you will see that, even for the REML fit, zero is comfortably within the 95% confidence intervals on all the random effects for the comp factor.

Having said all this, there is a better way of performing the optimization when there is only one parameter in the profiled deviance or profiled REML criterion.  The development version of the lme4 package, called lme4a on the R-forge site, allows for the model structure to be returned without optimization.  It is packaged in such a way that the parameter estimates can be determined by optimizers other than nlminb, which is the default.

The optimize function is usually more effective for one-dimensional problems than are more general optimizers such as nlminb or bobyqa.
If we use optimize on this problem

> fit <- lmer(y ~ dose * day + (1 | comp), doFit = FALSE) 
> optimize(fit at setPars, interval = c(0,4))
$minimum
[1] 0.08624942

$objective
[1] -21.95191

> fit <- lmer(y ~ dose * day + (1 | comp), REML = 0, doFit = FALSE) 
> optimize(fit at setPars, interval = c(0,4))
$minimum
[1] 6.96079e-05

$objective
[1] -81.82445

we do get convergence to non-zero parameter values, although the differences in the objective (either the profiled REML criterion or the profiled deviance) at these values compared to those at zero are negligible.