[R-sig-ME] Bug / problems in optimizer of lmer.

Wed Feb 3 16:29:02 CET 2010

On Wed, Feb 3, 2010 at 8:32 AM, Oliver Duerr <Oliver.Duerr at genedata.com> wrote:
> Hello all,
>
> First of all I don't know if I am in the right news group to report bug / suspicious behavior in lme4. If not I do apologize.
>
> I am currently developing a linear mixed model implementation. For testing I used the model
>
> y ~ dose * day + (1 | comp)
>
> and compared my implementation with lmer from lme4_0.999375-32 for about 30'000 different values of y.
> Generally the comparison is quite good but for certain y values (approx 100) I found small differences. Looking closer at some of those cases  I observe that the optimization reaches 0 but then it  cannot "escape" from anymore.
>
> One example, reproducible with the data provided below:
>>fit <- lmer(y ~ dose * day + (1 | comp), verbose=TRUE)
>  0:    -21.202738: 0.245256
>  1:    -21.863496:  0.00000
>  2:    -21.863496: 4.14318e-06
>
> Plotting the one dimensional reduced deviance reveals a minimum at ~ 0.0862 (this minimum can also be found by providing a different start value).
>
>> fit <- lmer(y ~ dose * day + (1 | comp), verbose=TRUE, start = c(0.05))
>  0:    -21.917896: 0.0500000
>  1:    -21.950001: 0.0784911
>  2:    -21.951063: 0.0913211
>  3:    -21.951911: 0.0860209
>  4:    -21.951913: 0.0862628
>  5:    -21.951913: 0.0862698
>
> In my implementation I had similar problems when I had to evaluate the likelihood at very small numbers it began to fluctuated and the optimizer had been trapped in a local minimum due to these numeric instabilities.
>
> Is this behavior known or do I maybe use lmer in a wrong way?

It is known in the sense that numerical optimization is not guaranteed
to produce a global optimum.  At best it can produce a local optimum.
On my system the REML fit does converge to the optimum you indicate
but, of course, 0 is well within any reasonable confidence interval on
the standard deviation of the random effects.  The ML fit converges to
an estimate of zero.

> print(fit <- lmer(y ~ dose * day + (1 | comp), verbose=TRUE), corr = FALSE)
  0:    -16.846009:  1.00000
  1:    -21.863496:  0.00000
  2:    -21.863496: 3.34671e-05
  3:    -21.864249: 0.00511679
  4:    -21.910197: 0.0451168
  5:    -21.951246: 0.0817081
  6:    -21.951470: 0.0899260
  7:    -21.951913: 0.0861839
  8:    -21.951913: 0.0862678
  9:    -21.951913: 0.0862697
Linear mixed model fit by REML
Formula: y ~ dose * day + (1 | comp)
  REML
-21.95

Random effects:
 Groups   Name        Variance   Std.Dev.
 comp     (Intercept) 0.00025129 0.015852
 Residual             0.03376472 0.183752
Number of obs: 133, groups: comp, 3

Fixed effects:
              Estimate Std. Error t value
(Intercept)  0.9553992  0.0183807   51.98
dose1       -0.0057256  0.0194846   -0.29
dose2        0.0189734  0.0112938    1.68
day1         0.0420788  0.0193691    2.17
day2         0.0032043  0.0113592    0.28
dose1:day1  -0.0057732  0.0237223   -0.24
dose2:day1  -0.0177054  0.0136961   -1.29
dose1:day2  -0.0148491  0.0138588   -1.07
dose2:day2   0.0007798  0.0080636    0.10
> print(fitML <- lmer(y ~ dose * day + (1 | comp), verbose=TRUE, REML = 0), corr = FALSE)
  0:    -73.180235:  1.00000
  1:    -81.824446:  0.00000
  2:    -81.824446:  0.00000
Linear mixed model fit by maximum likelihood
Formula: y ~ dose * day + (1 | comp)
    AIC    BIC logLik deviance
 -59.82 -28.03  40.91   -81.82

Random effects:
 Groups   Name        Variance Std.Dev.
 comp     (Intercept) 0.000000 0.00000
 Residual             0.031647 0.17790
Number of obs: 133, groups: comp, 3

Fixed effects:
              Estimate Std. Error t value
(Intercept)  0.9554630  0.0154320   61.91
dose1       -0.0057222  0.0188633   -0.30
dose2        0.0190337  0.0109334    1.74
day1         0.0420788  0.0187520    2.24
day2         0.0032680  0.0109970    0.30
dose1:day1  -0.0057732  0.0229664   -0.25
dose2:day1  -0.0177054  0.0132597   -1.34
dose1:day2  -0.0148456  0.0134166   -1.11
dose2:day2   0.0008401  0.0078059    0.11

Note that there are only three distinct levels of comp.  It is
difficult to estimate a variance component with only three distinct
levels of the grouping factor.

If you check the prediction intervals on the random effects from this
model (plot enclosed) you will see that, even for the REML fit, zero
is comfortably within the 95% confidence intervals on all the random
effects for the comp factor.

Having said all this, there is a better way of performing the
optimization when there is only one parameter in the profiled deviance
or profiled REML criterion.  The development version of the lme4
package, called lme4a on the R-forge site, allows for the model
structure to be returned without optimization.  It is packaged in such
a way that the parameter estimates can be determined by optimizers
other than nlminb, which is the default.

The optimize function is usually more effective for one-dimensional
problems than are more general optimizers such as nlminb or bobyqa.
If we use optimize on this problem

> fit <- lmer(y ~ dose * day + (1 | comp), doFit = FALSE)
> optimize(fit at setPars, interval = c(0,4))
$minimum
[1] 0.08624942

$objective
[1] -21.95191

> fit <- lmer(y ~ dose * day + (1 | comp), REML = 0, doFit = FALSE)
> optimize(fit at setPars, interval = c(0,4))
$minimum
[1] 6.96079e-05

$objective
[1] -81.82445

we do get convergence to non-zero parameter values, although the
differences in the objective (either the profiled REML criterion or
the profiled deviance) at these values compared to those at zero are
negligible.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Rplots.pdf
Type: application/pdf
Size: 2744 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20100203/fa5bd0d6/attachment.pdf>