[R-sig-ME] [R] False convergence of a glmer model
Douglas Bates
bates at stat.wisc.edu
Tue Feb 16 21:32:04 CET 2010
On Tue, Feb 16, 2010 at 2:23 PM, Shige Song <shigesong at gmail.com> wrote:
> Dear Doug,
>
> Your argument makes a lot of sense: after all, infant mortality is a rare
> event! I have two questions:
>
> 1) Is there a way to change the convergence criterion in a glmer model
> (to make it more tolerant)?
I'm not sure that is a good idea. If the linear predictor produces
probabilities that are so small that the deviance is insensitive to
the parameter values, what would it mean to quote estimates of those
parameters?
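A small R sketch of what that insensitivity looks like (the eta values are chosen purely for illustration, not taken from the data):

eta <- c(-7.0, -7.5, -8.0)   # hypothetical values of the linear predictor
p <- plogis(eta)             # fitted probabilities: 0.00091 0.00055 0.00034
-2 * log(1 - p)              # deviance contribution of one non-event (y = 0)
                             # 0.0018 0.0011 0.0007 -- a half-unit shift in the
                             # coefficient changes the deviance contribution by
                             # far less than any usual convergence tolerance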
> 2) Do you see a better approach than mixed logistic regression model
> in estimating infant mortality, given the fact that infant mortality is
> a rare event?
I don't know of other approaches myself. Others on the list (Ben?)
may have suggestions.
> On Tue, Feb 16, 2010 at 2:47 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
>> On Tue, Feb 16, 2010 at 9:38 AM, Shige Song <shigesong at gmail.com> wrote:
>>> Hi Doug,
>>>
>>> Thanks. Next time I will post it to the R-SIG-mixed-models mailing
>>> list, as you suggested.
>>
>> I have added R-SIG-mixed-models to the cc: list. I suggest we drop
>> the cc: to R-help after this message.
>>
>>> With respect to your question, the answer is no, these parameters do
>>> not make sense. Here is the Stata output from "exactly" the same
>>> model:
>>
>>> . xi:xtlogit inftmort i.cohort, i(code)
>>> i.cohort _Icohort_1-3 (naturally coded; _Icohort_1 omitted)
>>>
>>> Fitting comparison model:
>>>
>>> Iteration 0: log likelihood = -1754.4476
>>> Iteration 1: log likelihood = -1749.3366
>>> Iteration 2: log likelihood = -1749.2491
>>> Iteration 3: log likelihood = -1749.2491
>>>
>>> Fitting full model:
>>>
>>> tau = 0.0 log likelihood = -1749.2491
>>> tau = 0.1 log likelihood = -1743.8418
>>> tau = 0.2 log likelihood = -1739.0769
>>> tau = 0.3 log likelihood = -1736.4914
>>> tau = 0.4 log likelihood = -1739.5415
>>>
>>> Iteration 0: log likelihood = -1736.4914
>>> Iteration 1: log likelihood = -1722.6629
>>> Iteration 2: log likelihood = -1694.9114
>>> Iteration 3: log likelihood = -1694.6509
>>> Iteration 4: log likelihood = -1694.649
>>> Iteration 5: log likelihood = -1694.649
>>>
>>> Random-effects logistic regression Number of obs = 21694
>>> Group variable: code Number of groups = 10789
>>>
>>> Random effects u_i ~ Gaussian Obs per group: min = 1
>>> avg = 2.0
>>> max = 9
>>>
>>> Wald chi2(2) = 8.05
>>> Log likelihood = -1694.649 Prob > chi2 = 0.0178
>>
>> Well, the quantities being displayed in the iteration output from
>> glmer are the deviance and the parameter values. This Stata
>> log-likelihood corresponds to a deviance of 3389.3. It is possible
>> that glmer and Stata don't measure the log-likelihood on the same
>> scale but, if they do, then the estimates where glmer gets stalled
>> are producing a lower deviance of 2837.49.
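For reference, the conversion used here is just deviance = -2 * log-likelihood:

-2 * (-1694.649)   # Stata's final log-likelihood -> deviance 3389.298
                   # versus 2837.49, the value at which glmer stalled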
>>
>> The reason that glmer is getting stalled is the coefficient of
>> -7.45883 for the intercept. This corresponds to a mean success
>> probability of 0.0005 for cohort 1. The Stata coefficient estimate of
>> -5.214642 corresponds to a mean success probability of 0.0055 for
>> cohort 1, which is still very, very small. The success probabilities
>> for the other cohorts are going to be even smaller. What is the
>> overall proportion of zeros in the response?
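The two intercepts can be converted from the logit scale to probabilities with plogis(), and the event proportion can be checked directly (assuming the response inftmort is coded 0/1 in the data frame d, as in the glmer call):

plogis(-7.45883)        # glmer intercept -> about 0.00058
plogis(-5.214642)       # Stata intercept -> about 0.0054
mean(d$inftmort == 0)   # overall proportion of zeros in the response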
>>
>> The optimization to determine the maximum likelihood estimates of the
>> coefficients is being done on the scale of the linear predictor. When
>> the values of the linear predictor become very small or very large,
>> the fitted values become insensitive to the coefficients. The fact
>> that one program converges and another one doesn't may have more to do
>> with the convergence criterion than with the quality of the fit.
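On the logit scale the sensitivity of a fitted value to the linear predictor is dmu/deta = mu * (1 - mu), which dlogis() gives directly; at the estimates quoted above it is tiny (a quick illustration, not output from the original session):

dlogis(-7.45883)   # about 0.00057: a unit change in the linear predictor
                   # moves the fitted probability by less than 0.0006
dlogis(0)          # 0.25 at eta = 0, for comparison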
>>
>>
>>> ------------------------------------------------------------------------------
>>> inftmort | Coef. Std. Err. z P>|z| [95% Conf. Interval]
>>> -------------+----------------------------------------------------------------
>>> _Icohort_2 | -.5246846 .1850328 -2.84 0.005 -.8873422 -.1620269
>>> _Icohort_3 | -.1424331 .140369 -1.01 0.310 -.4175513 .132685
>>> _cons | -5.214642 .1839703 -28.35 0.000 -5.575217 -4.854067
>>> -------------+----------------------------------------------------------------
>>> /lnsig2u | .9232684 .1416214 .6456956 1.200841
>>> -------------+----------------------------------------------------------------
>>> sigma_u | 1.586665 .1123528 1.381055 1.822885
>>> rho | .4335015 .0347791 .3669899 .5024984
>>> ------------------------------------------------------------------------------
>>> Likelihood-ratio test of rho=0: chibar2(01) = 109.20 Prob >= chibar2 = 0.000
>>>
>>> The difference is quite huge, and Stata did not have any difficulties
>>> estimating this model, which makes me feel that I might have gotten
>>> some very basic specification wrong in my R model...
>>>
>>> Best,
>>> Shige
>>>
>>> On Tue, Feb 16, 2010 at 10:29 AM, Douglas Bates <bates at stat.wisc.edu> wrote:
>>>> On Tue, Feb 16, 2010 at 9:05 AM, Shige Song <shigesong at gmail.com> wrote:
>>>>> Dear All,
>>>>
>>>>> I am trying to fit a 2-level random intercept logistic regression on a
>>>>> data set of 20,000 cases. The model is specified as the following:
>>>>
>>>>> m1 <- glmer(inftmort ~ as.factor(cohort) + (1|code), family=binomial, data=d)
>>>>
>>>>> I got "Warning message: In mer_finalize(ans) : false convergence (8)"
>>>>
>>>> That message means that the optimizer function, nlminb, got stalled.
>>>> It has converged, but the point at which it has converged is not
>>>> clearly the optimum. In many cases this just indicates that the
>>>> optimizer is being overly cautious. However, it can also mean that
>>>> the problem is ill-defined.
>>>>
>>>> The fact that the second parameter is -7.46 is likely the problem. A
>>>> difference in the probability of infant mortality between levels of
>>>> cohort on the order of -7.5 on the logit scale is huge. Do the
>>>> estimated probabilities at this value of the parameters make sense?
>>>>
>>>> P.S. Questions of this sort may be more readily answered in the
>>>> R-SIG-mixed-models mailing list.
>>>>
>>>>> With the "verbose=TRUE" option, I was able to get the following
>>>>> iteration history:
>>>>>
>>>>> 0: 3456.4146: 1.15161 -3.99068 -0.498790 -0.122116
>>>>> 1: 3361.3370: 1.04044 -4.38172 -0.561756 -0.289991
>>>>> 2: 3303.7986: 1.48296 -4.40741 -0.566208 -0.259730
>>>>> 3: 3147.5537: 1.93037 -5.14388 -0.682530 -0.443006
>>>>> 4: 3123.6900: 2.10192 -5.18784 -0.685558 -0.428320
>>>>> 5: 2988.6287: 2.94890 -6.31023 -0.825286 -0.586282
>>>>> 6: 2958.3364: 3.25396 -6.88256 -0.316988 0.572428
>>>>> 7: 2853.7703: 4.22731 -7.44955 -0.279492 -0.294353
>>>>> 8: 2844.8476: 4.36583 -7.43902 -0.293111 -0.267308
>>>>> 9: 2843.2879: 4.39182 -7.44895 -0.298791 -0.265899
>>>>> 10: 2840.2676: 4.44288 -7.47103 -0.310477 -0.263945
>>>>> 11: 2839.0890: 4.46259 -7.48131 -0.315320 -0.263753
>>>>> 12: 2838.8550: 4.46649 -7.48344 -0.316292 -0.263745
>>>>> 13: 2838.3889: 4.47428 -7.48771 -0.318236 -0.263737
>>>>> 14: 2838.3703: 4.47459 -7.48788 -0.318314 -0.263738
>>>>> 15: 2838.2216: 4.47708 -7.48927 -0.318936 -0.263742
>>>>> 16: 2838.2157: 4.47718 -7.48932 -0.318961 -0.263742
>>>>> 17: 2838.2145: 4.47720 -7.48934 -0.318966 -0.263742
>>>>> 18: 2838.2121: 4.47724 -7.48936 -0.318976 -0.263742
>>>>> 19: 2838.2120: 4.47724 -7.48936 -0.318976 -0.263742
>>>>> 20: 2838.2118: 4.47724 -7.48936 -0.318977 -0.263742
>>>>> 21: 2838.2118: 4.47724 -7.48936 -0.318977 -0.263742
>>>>> 22: 2838.2118: 4.47724 -7.48936 -0.318977 -0.263742
>>>>> 23: 2838.2118: 4.47724 -7.48936 -0.318977 -0.263742
>>>>> 24: 2838.2118: 4.47724 -7.48936 -0.318977 -0.263742
>>>>> 25: 2838.2118: 4.47724 -7.48936 -0.318977 -0.263742
>>>>> 26: 2838.2118: 4.47724 -7.48936 -0.318977 -0.263742
>>>>> 27: 2838.2118: 4.47724 -7.48936 -0.318977 -0.263742
>>>>> 28: 2838.2118: 4.47724 -7.48936 -0.318977 -0.263742
>>>>> 29: 2838.2118: 4.47724 -7.48936 -0.318977 -0.263742
>>>>> 30: 2838.2118: 4.47724 -7.48936 -0.318977 -0.263742
>>>>> 31: 2838.2118: 4.47724 -7.48936 -0.318977 -0.263742
>>>>> 32: 2838.2118: 4.47724 -7.48936 -0.318977 -0.263742
>>>>> 33: 2837.8154: 4.46385 -7.47464 -0.495684 -0.263985
>>>>> 34: 2837.7613: 4.46641 -7.47053 -0.498335 -0.264014
>>>>> 35: 2837.6418: 4.47259 -7.46200 -0.501644 -0.264141
>>>>> 36: 2837.5982: 4.47485 -7.45928 -0.502598 -0.264214
>>>>> 37: 2837.5850: 4.47537 -7.45882 -0.502848 -0.264237
>>>>> 38: 2837.5307: 4.47674 -7.45848 -0.503216 -0.264313
>>>>> 39: 2837.5014: 4.47725 -7.45875 -0.503273 -0.264344
>>>>> 40: 2837.4955: 4.47735 -7.45881 -0.503284 -0.264350
>>>>> 41: 2837.4944: 4.47738 -7.45882 -0.503286 -0.264351
>>>>> 42: 2837.4941: 4.47738 -7.45882 -0.503287 -0.264351
>>>>> 43: 2837.4936: 4.47739 -7.45883 -0.503288 -0.264352
>>>>> 44: 2837.4935: 4.47739 -7.45883 -0.503288 -0.264352
>>>>> 45: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 46: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 47: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 48: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 49: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 50: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 51: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 52: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 53: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 54: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 55: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 56: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 57: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 58: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 59: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 60: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 61: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 62: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>> 63: 2837.4931: 4.47740 -7.45883 -0.503289 -0.264352
>>>>>
>>>>> By the way, the same model can be fitted in Stata using xtlogit and
>>>>> xtmelogit; a simpler model without the random component can be
>>>>> estimated in R as:
>>>>>
>>>>> m <- glm(inftmort ~ as.factor(cohort), family=binomial, data=d)
>>>>>
>>>>> I was also able to get highly consistent results via MCMC simulation
>>>>> using MCMCglmm.
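For completeness, a sketch of what such an MCMCglmm call might look like (the settings are guesses, here just the package defaults, and are not taken from the original post):

library(MCMCglmm)
m2 <- MCMCglmm(inftmort ~ as.factor(cohort), random = ~ code,
               family = "categorical", data = d,
               nitt = 13000, burnin = 3000, thin = 10)
summary(m2)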
>>>>>
>>>>> It would be greatly appreciated if someone could give me some hints
>>>>> on where to look further. Thanks.
>>>>>
>>>>> Best,
>>>>> Shige
>>>>>
>>>>> BTW, sorry about the earlier post, which was caused by a mistake.
>>>>>