[R-sig-ME] Convergence problem example
Paul Johnson
pauljohn32 at gmail.com
Wed Apr 25 16:56:51 CEST 2018
In the book Multilevel and Longitudinal Modeling using Stata,
Rabe-Hesketh and Skrondal have a lot of exercises and over the years
I've been trying to write Stata and R code to demonstrate
similarities/differences.
I've run into an example where Stata (either with builtin methods or
the addon gllamm) seems to think it gets estimates but lme4
diagnostics say there is a convergence failure. I want to impose on
you, get your advice about it. We see convergence warnings quite often
with lme4, but they are usually the "rescale your variables" errors,
not as blunt as this.
The first part retrieves "dairy.dta" from the book website
library(foreign)
library(lme4)
fn <- "dairy"
if (!file.exists(paste0(fn, ".dta12"))) {
download.file(paste0("http://www.stata-press.com/data/mlmus3/", fn, ".dta"),
destfile = paste0(fn, ".dta12"))
}
dairy <- read.dta(paste0(fn, ".dta12"))
#1
m1 <- glmer(fscr ~ lncfs + ai + heifer + (1 | cow),
family = binomial(link = "logit"), nAGQ = 30,
data = dairy)
summary(m1)
>From that, I see:
> m1 <- glmer(fscr ~ lncfs + ai + heifer + (1 | cow),
+ family = binomial(link = "logit"), nAGQ = 30,
+ data = dairy)
Warning messages:
1: 'rBind' is deprecated.
Since R version 3.2.0, base's rbind() should work fine with S4 objects
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00396932 (tol = 0.001,
component 1)
> summary(m1)
Generalized linear mixed model fit by maximum likelihood (Adaptive
Gauss-Hermite Quadrature, nAGQ = 30) [glmerMod]
Family: binomial ( logit )
Formula: fscr ~ lncfs + ai + heifer + (1 | cow)
Data: dairy
AIC BIC logLik deviance df.resid
4017.8 4047.9 -2003.9 4007.8 3022
Scaled residuals:
Min 1Q Median 3Q Max
-1.8136 -0.7652 -0.6479 1.0420 1.6213
Random effects:
Groups Name Variance Std.Dev.
cow (Intercept) 0.3419 0.5847
Number of obs: 3027, groups: cow, 1575
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.55693 0.43481 -3.581 0.000343 ***
lncfs 0.52221 0.10007 5.218 1.81e-07 ***
aiai -1.09560 0.12341 -8.878 < 2e-16 ***
heiferprimiparous -0.08259 0.09718 -0.850 0.395410
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) lncfs aiai
lncfs -0.965
aiai -0.193 -0.046
heifrprmprs -0.054 0.016 -0.051
convergence code: 0
Model failed to converge with max|grad| = 0.00396932 (tol = 0.001, component 1)
Stata output for comparison purposes, using gllamm:
. gllamm fscr lncfs ai heifer, i(cow) link(logit) fam(binom) adapt
Running adaptive quadrature
Iteration 0: log likelihood = -2004.6011
Iteration 1: log likelihood = -2003.9085
Iteration 2: log likelihood = -2003.9069
Adaptive quadrature has converged, running Newton-Raphson
Iteration 0: log likelihood = -2003.9069
Iteration 1: log likelihood = -2003.9069
Iteration 2: log likelihood = -2003.9064
Iteration 3: log likelihood = -2003.9064
number of level 1 units = 3027
number of level 2 units = 1575
Condition Number = 47.123877
gllamm model
log likelihood = -2003.9064
------------------------------------------------------------------------------
fscr | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lncfs | .5222157 .1000693 5.22 0.000 .3260834 .718348
ai | -1.095598 .1234099 -8.88 0.000 -1.337477 -.8537193
heifer | -.0825878 .0971811 -0.85 0.395 -.2730593 .1078837
_cons | -1.556961 .4348008 -3.58 0.000 -2.409155 -.7047673
------------------------------------------------------------------------------
Variances and covariances of random effects
------------------------------------------------------------------------------
***level 2 (cow)
var(1): .34188062 (.1263136)
------------------------------------------------------------------------------
Using the newer meglm that is provided with Stata
. meglm fscr lncfs ai heifer || cow: , family(binom) link(logit)
Fitting fixed-effects model:
Iteration 0: log likelihood = -2011.8253
Iteration 1: log likelihood = -2009.1421
Iteration 2: log likelihood = -2009.1412
Iteration 3: log likelihood = -2009.1412
Refining starting values:
Grid node 0: log likelihood = -2015.6021
Fitting full model:
Iteration 0: log likelihood = -2015.6021
Iteration 1: log likelihood = -2006.709
Iteration 2: log likelihood = -2003.9174
Iteration 3: log likelihood = -2003.9065
Iteration 4: log likelihood = -2003.9065
Mixed-effects GLM Number of obs = 3,027
Family: binomial
Link: logit
Group variable: cow Number of groups = 1,575
Obs per group:
min = 1
avg = 1.9
max = 5
Integration method: mvaghermite Integration pts. = 7
Wald chi2(3) = 103.86
Log likelihood = -2003.9065 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
fscr | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lncfs | .5222154 .1000694 5.22 0.000 .326083 .7183479
ai | -1.095597 .1234095 -8.88 0.000 -1.337476 -.8537191
heifer | -.0825878 .0971811 -0.85 0.395 -.2730593 .1078836
_cons | -1.556961 .4348012 -3.58 0.000 -2.409155 -.7047659
-------------+----------------------------------------------------------------
cow |
var(_cons)| .3418776 .1263105 .1657237 .7052721
------------------------------------------------------------------------------
LR test vs. logistic model: chibar2(01) = 10.47 Prob >= chibar2 = 0.0006
As far as I can see, parameter estimates are the same. I did not
compare conditional modes.
If lme4's converge warning is valid, then we have a concrete case
where Stata is reporting the wrong thing. I can see some upside there
:)
R details
> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 17.10
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lme4_1.1-15 Matrix_1.2-14 foreign_0.8-69
loaded via a namespace (and not attached):
[1] minqa_1.2.4 MASS_7.3-49 compiler_3.4.4 tools_3.4.4
[5] Rcpp_0.12.15 splines_3.4.4 nlme_3.1-137 grid_3.4.4
[9] nloptr_1.0.4 lattice_0.20-35
--
Paul E. Johnson http://pj.freefaculty.org
Director, Center for Research Methods and Data Analysis http://crmda.ku.edu
To write to me directly, please address me at pauljohn at ku.edu.
More information about the R-sig-mixed-models
mailing list