[R-sig-ME] Status of the development package lme4a

Douglas Bates bates at stat.wisc.edu
Fri Jun 25 22:39:22 CEST 2010


I have often referred to the development version of the lme4 package,
called lme4a.  At the risk of annoying people who don't want to hear
more about a package that they can't yet use, I provide this update.

The sources for lme4a are available from the SVN archive on R-forge
but binary packages are not.  I hope that will change in the near
future.

I have switched to using the marvelous Rcpp package created by Dirk
Eddelbuettel and Romain François, which I heartily recommend to those
writing C++/C code to be loaded into R.  Recently Romain has been on a
"code rant" creating "syntactic sugar" that makes it much easier to
write expressions using R vectors in C++ and it has just been too
tempting for me to use these capabilities.  That is why binary
packages are not available. Some of the code in the lme4a package
depends on the "Rcpp du jour", more or less, and doesn't build on
systems like win-builder or R-forge because of that dependency.  When
Dirk and Romain are ready to release Rcpp_0.8.3 to CRAN we'll be able
to pursue making binary packages available.

Another change from lme4 to lme4a is the use of the bobyqa optimizer
from the minqa package, instead of the nlminb optimizer.  Generally I
have been pleased with the results from bobyqa but I am always on the
lookout for good optimizers that will handle nonlinear objective
functions subject to box constraints on the parameters.  The lme4a
code is constructed so that the user can create a function to evaluate
the deviance without doing the actual optimization to get the
parameter estimates.  This allows for experimentation with other
optimizers.  At this summer's useR! conference Stefan Theussl, Kurt
Hornik and David Meyer will talk about their R Optimization
Infrastructure package and I look forward to perhaps writing a generic
interface to several different optimizers through that.  (Note to
Stefan et al: and I would also like to write the interface glue for
the optimizers in the minqa package for ROI, once you document what
must be written.)

Generally I am pleased with both the quality of the results and the
speed of the package.  For glmer and nlmer there are two optimizations
- the first involving only the variance-component parameters and the
second involving the variance component parameters and the fixed
effects.  A value of 0 for the optional argument nAGQ suppresses the
second optimization, which can take much longer than the first.  In
many cases the second optimization doesn't improve the result much but
I have seen cases where the result from the second optimization is
considerably better than that from the first.  (It should always be at
least as good as the first because the converged values from the first
optimization are used as the starting values for the second.)  I
enclose an example where there is a big difference. This is a slight
modification of the R code in Doran, Bates, Bliese and Dowling
(http://www.jstatsoft.org/v20/i02).  The good news is that the results
from these model fits are better than the results quoted in that paper
(the bad news is that we should now post a correction).

As I mentioned in a thread started by Dave Atkins, the optional
argument nAGQ to glmer and nlmer can be given the value 0, in which
case a faster algorithm that iterates over the variance-component
parameters only is used.
-------------- next part --------------

R version 2.11.1 (2010-05-31)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> ###################################################
> ### chunk number 1: preliminaries
> ###################################################
> library("lme4a")
Loading required package: Matrix
Loading required package: lattice

Attaching package: 'Matrix'

The following object(s) are masked from 'package:base':

    det

Loading required package: minqa
Loading required package: Rcpp

Attaching package: 'lme4a'

The following object(s) are masked from 'package:stats':

    AIC

> 
> 
> ###################################################
> ### chunk number 2: conversion
> ###################################################
> data("lq2002", package = "multilevel")
> wrk <- lq2002
> for (i in 3:16) wrk[[i]] <- ordered(wrk[[i]])
> for (i in 17:21) wrk[[i]] <- ordered(5 - wrk[[i]])
> lql <- within(reshape(wrk, varying = list(names(lq2002)[3:21]), v.names = "fivelev",
+                       idvar = "subj", timevar = "item", drop = names(lq2002)[c(2,22:27)],
+                       direction = "long"),
+           {
+               itype <- factor(ifelse(item < 12, "Leadership",
+                                      ifelse(item < 15, "Task Sig.", "Hostility")))
+               COMPID <- factor(COMPID)
+               item <- factor(item)
+               dichot <- factor(ifelse(fivelev < 4, 0, 1))
+               subj <- factor(subj)
+           })
> 
> 
> ###################################################
> ### chunk number 3: conv2
> ###################################################
> attr(lql,"reshapeLong") <- NULL
> lnkinv <- binomial()$linkinv
> 
> 
> ###################################################
> ### chunk number 4: lqlstr
> ###################################################
> str(lql)
'data.frame':	38798 obs. of  6 variables:
 $ COMPID : Factor w/ 49 levels "2","3","4","5",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ item   : Factor w/ 19 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ fivelev: Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 2 4 4 1 1 2 3 3 3 4 ...
 $ subj   : Factor w/ 2042 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ dichot : Factor w/ 2 levels "0","1": 1 2 2 1 1 1 1 1 1 2 ...
 $ itype  : Factor w/ 3 levels "Hostility","Leadership",..: 2 2 2 2 2 2 2 2 2 2 ...
> summary(lql)
     COMPID           item       fivelev        subj       dichot   
 13     : 1881   1      : 2042   1: 5208   1      :   19   0:19444  
 18     : 1786   2      : 2042   2: 5357   2      :   19   1:19354  
 46     : 1710   3      : 2042   3: 8879   3      :   19            
 15     : 1691   4      : 2042   4:11353   4      :   19            
 29     : 1615   5      : 2042   5: 8001   5      :   19            
 34     : 1482   6      : 2042             6      :   19            
 (Other):28633   (Other):26546             (Other):38684            
        itype      
 Hostility :10210  
 Leadership:22462  
 Task Sig. : 6126  
                   
                   
                   
                   
> 
> 
> ###################################################
> ### chunk number 5: fm1
> ###################################################
> (fm1 <- glmer(dichot ~ 0+itype+(1|subj)+(1|COMPID)+(1|item), lql, binomial, verbose = 1L))
npt = 7 , n =  3 
rhobeg =  0.2 , rhoend =  2e-07 
   0.020:  11:      40741.2; 1.35123 0.495665 0.564184 
  0.0020:  33:      40710.8; 1.50829 0.494469 0.531752 
 0.00020:  61:      40709.9; 1.51809 0.504002 0.615309 
 2.0e-05:  68:      40709.9; 1.51838 0.504114 0.613816 
 2.0e-06:  73:      40709.9; 1.51838 0.504112 0.613836 
 2.0e-07:  79:      40709.9; 1.51838 0.504114 0.613840 
At return
 86:     40709.884:  1.51838 0.504114 0.613840
npt = 12 , n =  6 
rhobeg =  0.3190403 , rhoend =  3.190403e-07 
   0.032:  13:      40709.9; 1.51838 0.504114 0.613840  1.59520 -0.473858 -0.131846 
  0.0032:  22:      40692.4; 1.57441 0.523403 0.634852  1.63328 -0.490345 -0.144524 
 0.00032:  39:      40689.2; 1.60306 0.534949 0.643919  1.67542 -0.495931 -0.141326 
 3.2e-05:  46:      40689.2; 1.60328 0.535336 0.644220  1.67589 -0.496257 -0.140769 
 3.2e-06:  55:      40689.2; 1.60332 0.535302 0.644204  1.67591 -0.496281 -0.140763 
 3.2e-07:  63:      40689.2; 1.60332 0.535303 0.644200  1.67591 -0.496282 -0.140763 
At return
 77:     40689.248:  1.60332 0.535302 0.644201  1.67591 -0.496282 -0.140764
Generalized linear mixed model fit by maximum likelihood ['merMod']
 Family: binomial 
Formula: dichot ~ 0 + itype + (1 | subj) + (1 | COMPID) + (1 | item) 
   Data: lql 
      AIC       BIC    logLik  deviance 
 40701.25  40752.64 -20344.62  40689.25 

Random effects:
 Groups Name        Variance Std.Dev.
 subj   (Intercept) 2.5706   1.6033  
 COMPID (Intercept) 0.2865   0.5353  
 item   (Intercept) 0.4150   0.6442  
Number of obs: 38798, groups: subj, 2042; COMPID, 49; item, 19

Fixed effects:
                Estimate Std. Error z value
itypeHostility    1.6759     0.3026   5.538
itypeLeadership  -0.4963     0.2139  -2.320
itypeTask Sig.   -0.1408     0.3835  -0.367

Correlation of Fixed Effects:
            itypHs itypLd
itypeLdrshp 0.120        
itypeTskSg. 0.067  0.095 
> 
> 
> ###################################################
> ### chunk number 6: fm1ranef
> ###################################################
> rr <- ranef(fm1, postVar = TRUE)
> str(rr$COMPID)
'data.frame':	49 obs. of  1 variable:
 $ (Intercept): num  0.0332 -0.0456 -0.2194 0.0255 0.1012 ...
 - attr(*, "postVar")= num [1, 1, 1:49] 0.0893 0.0649 0.0567 0.047 0.1325 ...
> head(rr$COMPID)
  (Intercept)
2  0.03322195
3 -0.04559110
4 -0.21944834
5  0.02549450
6  0.10115283
7  0.36494211
> 
> 
> ###################################################
> ### chunk number 7: fm1subj
> ###################################################
> qq <- qqmath(rr)
> print(qq$subj)
> 
> 
> ###################################################
> ### chunk number 8: fm1comp
> ###################################################
> print(qq$COMPID)
> 
> 
> ###################################################
> ### chunk number 9: fm1item
> ###################################################
> print(qq$item)
> 
> 
> ###################################################
> ### chunk number 10: fm2
> ###################################################
> (fm2 <- glmer(dichot ~ 0 + itype + (1|subj) + (0+itype|COMPID) + (1|item), lql, binomial, verbose = 1L))
npt = 14 , n =  8 
rhobeg =  0.2 , rhoend =  2e-07 
   0.020:  20:      40350.7; 1.39862 0.542358 0.180288 0.147768 0.420898 0.0125334 0.506361 0.739096 
  0.0020:  60:      40312.0; 1.53517 0.639396 0.410296 0.397523 0.464948 -0.224284 0.487233 0.646907 
 0.00020:  92:      40311.8; 1.54308 0.632955 0.399188 0.418186 0.458937 -0.207663 0.488347 0.620140 
 2.0e-05: 170:      40311.8; 1.54448 0.626989 0.393000 0.413212 0.460822 -0.204481 0.492391 0.627449 
 2.0e-06: 245:      40311.8; 1.54449 0.626270 0.393088 0.410954 0.460661 -0.205147 0.491765 0.626785 
 2.0e-07: 296:      40311.8; 1.54447 0.626253 0.393063 0.410957 0.460655 -0.205147 0.491738 0.626749 
At return
309:     40311.791:  1.54447 0.626253 0.393063 0.410957 0.460655 -0.205147 0.491738 0.626749
npt = 17 , n =  11 
rhobeg =  0.3305596 , rhoend =  3.305596e-07 
   0.033:  18:      40311.8; 1.54447 0.626253 0.393063 0.410957 0.460655 -0.205147 0.491738 0.626749  1.65280 -0.484036 -0.144485 
  0.0033:  36:      40294.2; 1.60488 0.631657 0.423541 0.458069 0.488782 -0.215207 0.516055 0.648040  1.68190 -0.502535 -0.112069 
 0.00033:  80:      40287.9; 1.63652 0.667969 0.420028 0.438087 0.489400 -0.208066 0.519277 0.657647  1.74174 -0.507172 -0.148206 
 3.3e-05: 111:      40287.9; 1.63602 0.664944 0.421462 0.441305 0.489187 -0.207635 0.518033 0.658308  1.74150 -0.507175 -0.148653 
 3.3e-06: 250:      40287.8; 1.63673 0.666256 0.421678 0.443048 0.489561 -0.209605 0.516668 0.657986  1.74346 -0.506651 -0.148784 
 3.3e-07: 809:      40287.8; 1.63706 0.666668 0.421492 0.442301 0.489018 -0.210597 0.516027 0.658385  1.74491 -0.507136 -0.150170 
At return
1233:     40287.829:  1.63713 0.666671 0.421527 0.442126 0.489014 -0.210660 0.516197 0.658422  1.74498 -0.507134 -0.150225
Generalized linear mixed model fit by maximum likelihood ['merMod']
 Family: binomial 
Formula: dichot ~ 0 + itype + (1 | subj) + (0 + itype | COMPID) + (1 |      item) 
   Data: lql 
      AIC       BIC    logLik  deviance 
 40309.83  40404.06 -20143.91  40287.83 

Random effects:
 Groups Name            Variance Std.Dev. Corr       
 subj   (Intercept)     2.6802   1.6371              
 COMPID itypeHostility  0.4445   0.6667              
        itypeLeadership 0.4168   0.6456   0.653      
        itypeTask Sig.  0.5063   0.7116   0.621 0.181
 item   (Intercept)     0.4335   0.6584              
Number of obs: 38798, groups: subj, 2042; COMPID, 49; item, 19

Fixed effects:
                Estimate Std. Error z value
itypeHostility    1.7450     0.3146   5.546
itypeLeadership  -0.5071     0.2242  -2.262
itypeTask Sig.   -0.1502     0.3976  -0.378

Correlation of Fixed Effects:
            itypHs itypLd
itypeLdrshp 0.108        
itypeTskSg. 0.064  0.040 
> 
> 
> ###################################################
> ### chunk number 11: fm2out
> ###################################################
> #print(fm2)
> 
> 
> ###################################################
> ### chunk number 12: fm3
> ###################################################
> (fm3 <- glmer(dichot ~ 0 + itype + (1|subj) + (1|COMPID:itype) + (1|item), lql, binomial, verbose = 1L))
npt = 7 , n =  3 
rhobeg =  0.2 , rhoend =  2e-07 
   0.020:  11:      40394.1; 1.35996 0.442839 0.896775 
  0.0020:  25:      40347.5; 1.51216 0.554390 0.975959 
 0.00020:  47:      40340.2; 1.54972 0.574367 0.624496 
 2.0e-05:  61:      40340.2; 1.54987 0.573398 0.626334 
 2.0e-06:  76:      40340.2; 1.54997 0.573608 0.626665 
 2.0e-07:  84:      40340.2; 1.54997 0.573613 0.626656 
At return
 91:     40340.235:  1.54997 0.573613 0.626656
npt = 12 , n =  6 
rhobeg =  0.3264886 , rhoend =  3.264886e-07 
   0.033:  13:      40340.2; 1.54997 0.573613 0.626656  1.63244 -0.516605 -0.182560 
  0.0033:  22:      40319.6; 1.60634 0.593426 0.650953  1.69948 -0.534045 -0.191250 
 0.00033:  49:      40316.4; 1.64318 0.613330 0.656907  1.71556 -0.540384 -0.184701 
 3.3e-05:  98:      40316.2; 1.64560 0.610322 0.656665  1.72500 -0.543570 -0.186336 
 3.3e-06: 124:      40316.2; 1.64651 0.610800 0.657226  1.72450 -0.543193 -0.187133 
 3.3e-07: 137:      40316.2; 1.64650 0.610804 0.657227  1.72450 -0.543193 -0.187144 
At return
161:     40316.198:  1.64649 0.610802 0.657233  1.72450 -0.543192 -0.187146
Generalized linear mixed model fit by maximum likelihood ['merMod']
 Family: binomial 
Formula: dichot ~ 0 + itype + (1 | subj) + (1 | COMPID:itype) + (1 | item) 
   Data: lql 
      AIC       BIC    logLik  deviance 
 40328.20  40379.59 -20158.10  40316.20 

Random effects:
 Groups       Name        Variance Std.Dev.
 subj         (Intercept) 2.7109   1.6465  
 COMPID:itype (Intercept) 0.3731   0.6108  
 item         (Intercept) 0.4320   0.6572  
Number of obs: 38798, groups: subj, 2042; COMPID:itype, 147; item, 19

Fixed effects:
                Estimate Std. Error z value
itypeHostility    1.7245     0.3116   5.534
itypeLeadership  -0.5432     0.2216  -2.451
itypeTask Sig.   -0.1871     0.3934  -0.476

Correlation of Fixed Effects:
            itypHs itypLd
itypeLdrshp 0.024        
itypeTskSg. 0.014  0.020 
> (fm3a <- glmer(dichot ~ 0 + itype + (1|subj) + (1|COMPID:itype) + (1|COMPID) + (1|item), lql, binomial, verbose = 1L))
npt = 9 , n =  4 
rhobeg =  0.2 , rhoend =  2e-07 
   0.020:  14:      40364.4; 1.37711 0.501643 0.628101 0.816288 
  0.0020:  36:      40328.0; 1.50852 0.454115 0.437267 0.716678 
 0.00020:  78:      40326.0; 1.54465 0.472324 0.438092 0.629258 
 2.0e-05:  95:      40326.0; 1.54406 0.471497 0.433136 0.627481 
 2.0e-06: 111:      40326.0; 1.54409 0.471502 0.433098 0.627292 
 2.0e-07: 124:      40326.0; 1.54408 0.471497 0.433097 0.627273 
At return
134:     40325.999:  1.54408 0.471497 0.433097 0.627273
npt = 13 , n =  7 
rhobeg =  0.331217 , rhoend =  3.31217e-07 
   0.033:  14:      40326.0; 1.54408 0.471497 0.433097 0.627273  1.65608 -0.490604 -0.151649 
  0.0033:  17:      40326.0; 1.54408 0.471497 0.433097 0.627273  1.65608 -0.490604 -0.151649 
 0.00033:  46:      40302.0; 1.63538 0.496953 0.465459 0.659065  1.74911 -0.513849 -0.157569 
 3.3e-05:  63:      40302.0; 1.63667 0.497528 0.464641 0.659300  1.74920 -0.514326 -0.158874 
 3.3e-06:  72:      40302.0; 1.63669 0.497531 0.464599 0.659319  1.74923 -0.514321 -0.158871 
 3.3e-07:  86:      40302.0; 1.63669 0.497531 0.464597 0.659320  1.74923 -0.514329 -0.158881 
At return
 98:     40302.000:  1.63669 0.497531 0.464598 0.659319  1.74923 -0.514329 -0.158881
Generalized linear mixed model fit by maximum likelihood ['merMod']
 Family: binomial 
Formula: dichot ~ 0 + itype + (1 | subj) + (1 | COMPID:itype) + (1 | COMPID) +      (1 | item) 
   Data: lql 
      AIC       BIC    logLik  deviance 
 40316.00  40375.96 -20151.00  40302.00 

Random effects:
 Groups       Name        Variance Std.Dev.
 subj         (Intercept) 2.6788   1.6367  
 COMPID:itype (Intercept) 0.2475   0.4975  
 COMPID       (Intercept) 0.2159   0.4646  
 item         (Intercept) 0.4347   0.6593  
Number of obs: 38798, groups: subj, 2042; COMPID:itype, 147; COMPID, 49; item, 19

Fixed effects:
                Estimate Std. Error z value
itypeHostility    1.7492     0.3157   5.541
itypeLeadership  -0.5143     0.2266  -2.270
itypeTask Sig.   -0.1589     0.3971  -0.400

Correlation of Fixed Effects:
            itypHs itypLd
itypeLdrshp 0.088        
itypeTskSg. 0.050  0.070 
> 
> 
> ###################################################
> ### chunk number 13: fm23comp
> ###################################################
> anova(fm3,fm3a,fm2)
Data: lql
Models:
fm3: dichot ~ 0 + itype + (1 | subj) + (1 | COMPID:itype) + (1 | item)
fm3a: dichot ~ 0 + itype + (1 | subj) + (1 | COMPID:itype) + (1 | COMPID) + 
fm3a:     (1 | item)
fm2: dichot ~ 0 + itype + (1 | subj) + (0 + itype | COMPID) + (1 | 
fm2:     item)
     Df   AIC   BIC logLik  Chisq Chi Df Pr(>Chisq)    
fm3   6 40328 40380 -20158                             
fm3a  7 40316 40376 -20151 14.198      1  0.0001645 ***
fm2  11 40310 40404 -20144 14.171      4  0.0067686 ** 
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 
> 
> 
> ###################################################
> ### chunk number 14: splom
> ###################################################
> rr2 <- ranef(fm2, postVar = TRUE)
> print(splom(rr2$COMPID))
> 
> 
> ###################################################
> ### chunk number 15: cat2
> ###################################################
> qq2 <- qqmath(rr2)
> print(qq2$COMPID)
> 
> 
> ###################################################
> ### chunk number 16: iParams
> ###################################################
> str(imap <- unique(lql[, c("itype", "item")]))
'data.frame':	19 obs. of  2 variables:
 $ itype: Factor w/ 3 levels "Hostility","Leadership",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ item : Factor w/ 19 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
> (easiness <- ranef(fm2)$item[[1]] + fixef(fm2)[imap$itype])
itypeLeadership itypeLeadership itypeLeadership itypeLeadership itypeLeadership 
    -0.41119010      0.37121913     -1.43984345     -0.69252265     -1.10502062 
itypeLeadership itypeLeadership itypeLeadership itypeLeadership itypeLeadership 
     0.20917647     -0.85582304      0.37127642     -1.22834689     -0.01874933 
itypeLeadership  itypeTask Sig.  itypeTask Sig.  itypeTask Sig.  itypeHostility 
    -0.75757333     -0.73730287      0.11498491      0.18156054      0.61385233 
 itypeHostility  itypeHostility  itypeHostility  itypeHostility 
     2.47447112      1.41594039      1.73665401      2.49778523 
> 
> 
> ###################################################
> ### chunk number 17: compParams
> ###################################################
> compPar <- t(fixef(fm2) + t(ranef(fm2)$COMPID))
> head(compPar)
  itypeHostility itypeLeadership itypeTask Sig.
2       1.481509     -0.72862096     0.62270773
3       2.137032     -0.75702385     0.16635924
4       1.494098     -1.06671635     0.55629019
5       2.061113     -0.99900157     1.17293550
6       1.948834      0.09073122    -1.00704904
7       2.283477     -0.02442879    -0.06122837
> 
> 
> ###################################################
> ### chunk number 18: compprob
> ###################################################
> head(binomial()$linkinv(compPar))
  itypeHostility itypeLeadership itypeTask Sig.
2      0.8148005       0.3254974      0.6508341
3      0.8944507       0.3192928      0.5414942
4      0.8166925       0.2560280      0.6355937
5      0.8870657       0.2691378      0.7636752
6      0.8753194       0.5226673      0.2675578
7      0.9074993       0.4938931      0.4846977
> 
> proc.time()
    user   system  elapsed 
1073.250   43.970 1205.453 


More information about the R-sig-mixed-models mailing list