[R-sig-ME] Convergence problem example

Wed Apr 25 16:56:51 CEST 2018

In the book Multilevel and Longitudinal Modeling using Stata,
Rabe-Hesketh and Skrondal have a lot of exercises and over the years
I've been trying to write Stata and R code to demonstrate
similarities/differences.

I've run into an example where Stata (either with builtin methods or
the addon gllamm) seems to think it gets estimates but lme4
diagnostics say there is a convergence failure.  I want to impose on
you, get your advice about it. We see convergence warnings quite often
with lme4, but they are usually the "rescale your variables" errors,
not as blunt as this.

The first part retrieves "dairy.dta" from the book website

library(foreign)
library(lme4)

fn <- "dairy"
if (!file.exists(paste0(fn, ".dta12"))) {
    download.file(paste0("http://www.stata-press.com/data/mlmus3/", fn, ".dta"),
                  destfile = paste0(fn, ".dta12"))
}

dairy <- read.dta(paste0(fn, ".dta12"))

#1
m1 <- glmer(fscr ~ lncfs + ai + heifer + (1 | cow),
            family = binomial(link = "logit"), nAGQ = 30,
            data = dairy)
summary(m1)

>From that, I see:
> m1 <- glmer(fscr ~ lncfs + ai + heifer + (1 | cow),
+             family = binomial(link = "logit"), nAGQ = 30,
+             data = dairy)
Warning messages:
1: 'rBind' is deprecated.
 Since R version 3.2.0, base's rbind() should work fine with S4 objects
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge with max|grad| = 0.00396932 (tol = 0.001,
component 1)
> summary(m1)
Generalized linear mixed model fit by maximum likelihood (Adaptive
  Gauss-Hermite Quadrature, nAGQ = 30) [glmerMod]
 Family: binomial  ( logit )
Formula: fscr ~ lncfs + ai + heifer + (1 | cow)
   Data: dairy

     AIC      BIC   logLik deviance df.resid
  4017.8   4047.9  -2003.9   4007.8     3022

Scaled residuals:
    Min      1Q  Median      3Q     Max
-1.8136 -0.7652 -0.6479  1.0420  1.6213

Random effects:
 Groups Name        Variance Std.Dev.
 cow    (Intercept) 0.3419   0.5847
Number of obs: 3027, groups:  cow, 1575

Fixed effects:
                  Estimate Std. Error z value Pr(>|z|)
(Intercept)       -1.55693    0.43481  -3.581 0.000343 ***
lncfs              0.52221    0.10007   5.218 1.81e-07 ***
aiai              -1.09560    0.12341  -8.878  < 2e-16 ***
heiferprimiparous -0.08259    0.09718  -0.850 0.395410
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) lncfs  aiai
lncfs       -0.965
aiai        -0.193 -0.046
heifrprmprs -0.054  0.016 -0.051
convergence code: 0
Model failed to converge with max|grad| = 0.00396932 (tol = 0.001, component 1)

Stata output for comparison purposes, using gllamm:

. gllamm fscr lncfs ai heifer, i(cow) link(logit) fam(binom) adapt

Running adaptive quadrature
Iteration 0:    log likelihood = -2004.6011
Iteration 1:    log likelihood = -2003.9085
Iteration 2:    log likelihood = -2003.9069

Adaptive quadrature has converged, running Newton-Raphson
Iteration 0:   log likelihood = -2003.9069
Iteration 1:   log likelihood = -2003.9069
Iteration 2:   log likelihood = -2003.9064
Iteration 3:   log likelihood = -2003.9064

number of level 1 units = 3027
number of level 2 units = 1575

Condition Number = 47.123877

gllamm model

log likelihood = -2003.9064

------------------------------------------------------------------------------
        fscr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lncfs |   .5222157   .1000693     5.22   0.000     .3260834     .718348
          ai |  -1.095598   .1234099    -8.88   0.000    -1.337477   -.8537193
      heifer |  -.0825878   .0971811    -0.85   0.395    -.2730593    .1078837
       _cons |  -1.556961   .4348008    -3.58   0.000    -2.409155   -.7047673
------------------------------------------------------------------------------

Variances and covariances of random effects
------------------------------------------------------------------------------

***level 2 (cow)

    var(1): .34188062 (.1263136)
------------------------------------------------------------------------------

Using the newer meglm that is provided with Stata

. meglm fscr lncfs ai heifer || cow: , family(binom) link(logit)

Fitting fixed-effects model:

Iteration 0:   log likelihood = -2011.8253
Iteration 1:   log likelihood = -2009.1421
Iteration 2:   log likelihood = -2009.1412
Iteration 3:   log likelihood = -2009.1412

Refining starting values:

Grid node 0:   log likelihood = -2015.6021

Fitting full model:

Iteration 0:   log likelihood = -2015.6021
Iteration 1:   log likelihood =  -2006.709
Iteration 2:   log likelihood = -2003.9174
Iteration 3:   log likelihood = -2003.9065
Iteration 4:   log likelihood = -2003.9065

Mixed-effects GLM                               Number of obs     =      3,027
Family:                binomial
Link:                     logit
Group variable:             cow                 Number of groups  =      1,575

                                                Obs per group:
                                                              min =          1
                                                              avg =        1.9
                                                              max =          5

Integration method: mvaghermite                 Integration pts.  =          7

                                                Wald chi2(3)      =     103.86
Log likelihood = -2003.9065                     Prob > chi2       =     0.0000
------------------------------------------------------------------------------
        fscr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lncfs |   .5222154   .1000694     5.22   0.000      .326083    .7183479
          ai |  -1.095597   .1234095    -8.88   0.000    -1.337476   -.8537191
      heifer |  -.0825878   .0971811    -0.85   0.395    -.2730593    .1078836
       _cons |  -1.556961   .4348012    -3.58   0.000    -2.409155   -.7047659
-------------+----------------------------------------------------------------
cow          |
   var(_cons)|   .3418776   .1263105                      .1657237    .7052721
------------------------------------------------------------------------------
LR test vs. logistic model: chibar2(01) = 10.47       Prob >= chibar2 = 0.0006

As far as I can see, parameter estimates are the same. I did not
compare conditional modes.

If lme4's converge warning is valid, then we have a concrete case
where Stata is reporting the wrong thing.  I can see some upside there
:)

R details

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 17.10

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] lme4_1.1-15    Matrix_1.2-14  foreign_0.8-69

loaded via a namespace (and not attached):
 [1] minqa_1.2.4     MASS_7.3-49     compiler_3.4.4  tools_3.4.4
 [5] Rcpp_0.12.15    splines_3.4.4   nlme_3.1-137    grid_3.4.4
 [9] nloptr_1.0.4    lattice_0.20-35

-- 
Paul E. Johnson   http://pj.freefaculty.org
Director, Center for Research Methods and Data Analysis http://crmda.ku.edu

To write to me directly, please address me at pauljohn at ku.edu.