[R-sig-ME] LMER vs MLwiN

Fri Feb 17 15:49:32 CET 2012

W Robert Long <longrob604 at ...> writes:

> I'm new to using mixed models in R. Thus far I've been using MLwiN. I am 
> trying to duplicate the results in MLwiN of a logistic mixed effects model.
> 
> At the moment I have no covariates, and a data hierarchy of
> pupil within class within school within commune with random effects at 
> each level above pupil.
> 
> There are 9000 observations in total;
> 
> 300 classes
> 100 schools
> 20 communes
> These have been set as factors with as.factor(). I'm not sure if this is 
> correct as they were not categorical in MLwiN (they were just ints) but 
> I was getting this error before I did that:
> "Error: length(f1) == length(f2) is not TRUE"
> 
> I have tried to fit a model with glmer like this:
> 
> glmer(LOSS~(1|COMMUNE/SCHOOL/CLASS/PUPIL),data=dt,family=binomial(link = 
> "logit"))
> 
> However this generates the error
> "Error: cannot allocate vector of size 9.5 Gb"
> 
> I have also tried glmmPQL in the MASS package:
> glmmPQL(LOSS~1,data=dt, random = 
> ~1|COMMUNE/SCHOOL/CLASS/PUPIL,family=binomial(link = "logit"))
> 
> However this generates /completely/ wrong estimates so I can only assume 
> that I am specifying the model incorrectly in R.
> 

  Hmmm. I'm not quite sure what's wrong, since a made-up example
with the same structure as yours worked pretty well on my system
(13 seconds to run, on a Linux virtual machine with a *total* of
3G of RAM).

ncomm <- 20
nschool <- 100
nclass <- 300
ntot <- 9000

set.seed(101)

u.comm <- rnorm(ncomm)
u.school <- rnorm(nschool)
u.class <- rnorm(nclass)
u.obs <- rnorm(ntot)

dt <- expand.grid(commune=factor(1:ncomm),
                  school=factor(1:(nschool/ncomm)),
                  class=factor(1:(nclass/nschool)),
                  pupil=factor(1:(ntot/nclass)))

## here lower-level units are not numbered uniquely, but that's OK as
## long as we always using nested syntax to refer to them

eta <- with(dt,u.comm[commune]+u.school[commune:school]+
            u.class[commune:school:class]+u.obs)
dt$loss <- rbinom(ntot,plogis(eta),size=1)

summary(dt)

    commune     school   class        pupil           loss       
 1      : 450   1:1800   1:3000   1      : 300   Min.   :0.0000  
 2      : 450   2:1800   2:3000   2      : 300   1st Qu.:0.0000  
 3      : 450   3:1800   3:3000   3      : 300   Median :0.0000  
 4      : 450   4:1800            4      : 300   Mean   :0.4776  
 5      : 450   5:1800            5      : 300   3rd Qu.:1.0000  
 6      : 450                     6      : 300   Max.   :1.0000  
 (Other):6300                     (Other):7200

library(lme4)
t1 <- system.time(g1 <- glmer(loss~(1|commune/school/class/pupil),data=dt,
                        family=binomial))

library(MASS)
t2 <- system.time(g2 <- glmmPQL(loss~1,
                                random=~1|commune/school/class/pupil,data=dt,
                        family=binomial))
## Error in lme.formula(fixed = zz ~ 1, random = ~1 |
commune/school/class/pupil,  : 
##   nlminb problem, convergence error code = 1
##  message = iteration limit reached without convergence (10)

lmer appears to underestimate the true variances (which were all set
to 1.0):

unlist(VarCorr(g1))
## pupil:(class:(school:commune))         class:(school:commune) 
##                     0.0000000                      0.7556695 
##                 school:commune                        commune 
##                     0.5782637                      0.3326618 

I also tried this in the development version of lme4 (which has
very different machinery internally) and got the same answer,
at least to within about 0.3% ... (it took about 3 seconds longer).

  Can you export this data set and try it in MLWiN?