[R-sig-ME] Start values for glmer: Bug?

Adam D. I. Kramer adik at ilovebacon.org
Tue Apr 27 09:40:15 CEST 2010


On Mon, 26 Apr 2010, Ben Bolker wrote:

> Adam D. I. Kramer wrote:
>> On Mon, 26 Apr 2010, Ben Bolker wrote:
>>
>>>  I don't think we really need your data, because it's easy to come up with
>>> a reproducible example:
>>>
>>> library(lme4)
>>> gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
>>>             family = binomial, data = cbpp, verbose=TRUE)
>>>
>>> gm2 <- update(gm1,start=list(ST=gm1 at ST,fixef=fixef(gm1)),verbose=TRUE)
>>>
>>> ## I think what would be intended for setting only the
>>> ## ST parameters would be:
>>>
>>> gm3 <- update(gm1,start=list(herd=gm1 at ST),verbose=TRUE)
>>
>> ...gm3 appears to work, while gm2 throws the error.
>>
>> This also works:
>>
>> gm2 <- update(gm1,start=list(ST=gm1 at ST),verbose=TRUE)
>>
>> ...which suggests to me that the error is in the fixef specification, not
>> the ranef specification.
>
>  I think we need to look more closely.  "Causes a warning" and "works" are
> not the same (although "causes a warning even when there isn't really a
> problem" is still a bug).

Apologies for my imprecise wording. When I said "error" I meant "bug."

>  In the verbose output, e.g
>
> 17:     100.09586: 0.642260 -1.39854 -0.992335 -1.12868 -1.58031
> (the last line of the initial fit), the first element (before the colon)
> is deviance, the rest of the elements are (var-cov, fixed effects).
>
> start=list(ST=gm1 at ST,fixef=fixef(gm1))
>
> starts in the right place:
>
>    0:     100.09586: 0.642260 -1.39854 -0.992335 -1.12868 -1.58037

...should we worry about the last fixed effect being off by 6e-5 ? I would
expect them to be precisely the same.

Perhaps the output at 0 is the output of the estimated effects AFTER the 0th
iteration? That would explain the below as well.

Also, this run actually takes 6 iterations and comes up with a slightly
different solution:

6:     100.09586: 0.642261 -1.39853 -0.992333 -1.12867 -1.58032

...granted, the deviance is the same, but it seems odd to me that this would
take more than 1 iteration.

> start=list(herd=gm1 at ST[[1]])
>  doesn't give an error, but doesn't start in the right place either
> (it should set starting var-cov to 0.642260):
>
>  0:     101.94202: 0.845154 -1.26902 -1.17076 -1.30141 -1.78228

I agree that that should give an error. My reading of the syntax suggested
by ?glmer suggests that the syntax above is incorrect, and that these ones
are correct:

gm2 <- update(gm1,start=list(ST=gm1 at ST),verbose=TRUE)
   0:     100.74403: 0.642260 -1.26902 -1.17076 -1.30141 -1.78228

...there, I specified ST=, rather than herd=.

Specifying just c() also works as expected, so long as the c() contains only
the ST matrix:
gm2 <- update(gm1,start=c(gm1 at ST[[1]]),verbose=TRUE)
   0:     100.74403: 0.642260 -1.26902 -1.17076 -1.30141 -1.78228

...but specifying the fixef inside the c() (which seems like a reasonable
thing to do, given that the verbose output appears to be a vector...so we
might want a warning here too) also fails:

gm2 <- update(gm1,start=c(gm1 at ST[[1]],fixef(gm1)),verbose=TRUE)
   0:     101.94202: 0.845154 -1.26902 -1.17076 -1.30141 -1.78228

...I think this is illustrated a bit better if we complicate the random
effects structure:

cbpp$grp <- rep(1:2,each=28)
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (period | herd) +
(period | grp), family = binomial, data = cbpp, verbose=TRUE)

...then gm1 at ST is:

gm1 at ST
[[1]]
             (Intercept)    period2       period3      period4
(Intercept)   0.9068296  0.0000000  0.000000e+00 0.000000e+00
period2      -1.3084100  0.6256717  0.000000e+00 0.000000e+00
period3      -1.3421617  2.5756482  2.928284e-05 0.000000e+00
period4      -0.9121509 -0.8045486 -4.480320e-01 7.996506e-06

[[2]]
             (Intercept)       period2   period3      period4
(Intercept)   0.3956145  0.000000e+00 0.0000000 0.000000e+00
period2      -1.1822589  5.288660e-06 0.0000000 0.000000e+00
period3       0.3188161 -1.906580e-02 0.0000000 0.000000e+00
period4      -0.4143081 -3.631300e-02 0.0331796 3.338474e-07

...with [[1]] being the herd effect and [[2]] being the grp effect. Then
fitting gives us:

...
160:     78.533021: 0.906830 0.625672 2.92828e-05 7.99651e-06 -1.30841
-1.34216 -0.912151  2.57565 -0.804549 -0.448032 0.395614 5.28866e-06
0.00000 3.33847e-07 -1.18226 0.318816 -0.414308 -0.0190658 -0.0363130
0.0331796 -1.55707 -1.10585 -1.86776 -1.62993

...which looks to me to be:
  c( diag(gm1 at ST[[1]]),
     gm1 at ST[[1]][lower.tri(gm1 at ST[[1]])],
     diag(gm1 at ST[[2]]),
     gm1 at ST[[2]][lower.tri(gm1 at ST[[2]])],
     fixef(gm1)
   )

...in this case, I can't figure out a syntax that gets the first iteration
of gm2 to match the above. The failures are all the same, which is to say,
they use the same starting values glmer() creates without input.

This one, I would expect to succeed:
gm2 <- update(gm1, start=list(ST=gm1 at ST,fixef=fixef(gm1)))
...but it fails:
   0:     105.57844: 0.845154  1.69031  1.69031  1.75412  0.00000  0.00000
0.00000  0.00000  0.00000  0.00000 0.308607 0.617213 0.617213 0.640513
0.00000  0.00000  0.00000  0.00000  0.00000  0.00000 -1.26902 -1.17076
-1.30141 -1.78228

This one also:
gm2 <- update(gm1, start=gm1 at ST)
...also fails:
   0:     105.57844: 0.845154  1.69031  1.69031  1.75412  0.00000  0.00000
0.00000  0.00000  0.00000  0.00000 0.308607 0.617213 0.617213 0.640513
0.00000  0.00000  0.00000  0.00000  0.00000  0.00000 -1.26902 -1.17076
-1.30141 -1.78228

...the closest I can get is:

gm2 <- update(gm1, start= c( diag(gm1 at ST[[1]]),
     gm1 at ST[[1]][lower.tri(gm1 at ST[[1]])],
     diag(gm1 at ST[[2]]),
     gm1 at ST[[2]][lower.tri(gm1 at ST[[2]])] ) )

...but that only sets the random-effect start values correctly, not the
fixed effects. Adding fixef(gm1) to the end or passing the above as the ST
component of a list throws errors.

>  I could dig in and fix this according to my understanding, but I'd
> rather wait for Martin Maechler or Doug Bates to chime in ...

Agreed, though if our understanding is off by much, I would quite like to
know why.

--Adam




More information about the R-sig-mixed-models mailing list