[R-sig-ME] lme4 convergence problem

Thu Jan 13 23:38:19 CET 2011

On Thu, Jan 13, 2011 at 3:39 PM, Murray Jorgensen
<maj at stats.waikato.ac.nz> wrote:
> Dear List & Maintainers,
>
> I am currently running some scripts for some of the data sets in the book by
> Andrew Gelman and Jennifer Hill. (2006). (Data Analysis Using Regression and
> Multilevel/Hierarchical Models. Cambridge University Press.) "arm"
>
> However I have encountered a convergence difference between two versions of
> R and its packages which seems to be a purely lme4 issue.
>
> I have been mostly using a Dell laptop running Windows XP. Previously I had
> been using R 2.11.1 but with packages at 2.10.1. Now I have brought R and
> the packages both up to 2.12.1.
>
> Here is my script: it uses the attached file which has itself been
> constructed using a a script from the "arm" website but is self-contained.
>
> library(lme4)
> heights.clean =
> read.table("C:\\Files\\R&Splus\\Gelman\\examples\\earnings\\heights.clean.txt",
>                header = TRUE)
> attach(heights.clean)
> y <- log(earn)
> x <- height
> n <- length(y)
> n.age <- 3
> n.eth <- 4
> age <- age.category
> # regression of log (earnings) on height, age, and ethnicity
> M1 <- lmer (y ~ x + (1 + x | eth))
> summary (M1)
> M1 <- lmer (y ~ x + (1 + x | eth), verbose = TRUE)

The fitted model is degenerate in that the variance-covariance matrix
of the random effects is singular.  In lme4a, which uses a different
representation and a different optimizer, there is no problem with
convergence.

> summary(M1 <- lmer(log(earn) ~ height + (height|eth), heights.clean, verbose=TRUE))
npt = 7 , n =  3
rhobeg =  0.2 , rhoend =  2e-07
   0.020:  14:      3124.46; 2.65271 -0.0623503  0.00000
  0.0020:  22:      3117.59; 2.66693 -0.0482882  0.00000
 0.00020:  35:      3110.40; 2.64588 -0.0411483  0.00000
 2.0e-05:  42:      3110.40; 2.64588 -0.0411483  0.00000
 2.0e-06:  45:      3110.40; 2.64588 -0.0411483  0.00000
 2.0e-07: 137:      3109.75; 1.53325 -0.0240737  0.00000
At return
161:     3109.7460:  1.53327 -0.0240739  0.00000
Linear mixed model fit by REML ['summary.mer']
Formula: log(earn) ~ height + (height | eth)
   Data: heights.clean
REML criterion at convergence: 3109.746

Random effects:
 Groups   Name        Variance  Std.Dev. Corr
 eth      (Intercept) 1.8682051 1.36682
          height      0.0004606 0.02146  -1.000
 Residual             0.7946718 0.89144
Number of obs: 1187, groups: eth, 4

Fixed effects:
            Estimate Std. Error t value
(Intercept)  6.80491    0.98175   6.931
height       0.04261    0.01526   2.792

Correlation of Fixed Effects:
       (Intr)
height -0.999

What you are seeing in different results is slight perturbations in
the REML criterion evaluation causing the optimizer to take a
different path.  Unfortunately, the optimizer used in lme4 is
hard-wired into the code and it is not easy to insert other
optimizers.

I would dearly love to say that lme4a will be released for general use
"soon" but that may not come to pass.  During the last couple of weeks
Romain Francois, Dominick Sampieri and I have been trying to debug
some gliches in the loading of Rcpp Modules, which are used in lme4a.
It turns out that one of the bugs is actually in the R code itself.
We know how to fix it and Luke Tierney just committed the patch to the
R-devel archive.  I don't know yet if there will be an R-2.12.2 patch
release with this fix.