[R-sig-ME] simulation

Fri Dec 23 14:48:06 CET 2011

Ran some more simulations and never could get lmer to converge.  I dumped
one of my simulation into SAS and received a message that the covariance 
matrix for the random effects was not positive definite.  Now this is pretty 
informative to me: it says that the variance of the random effects is zero 
(or negative).  Does that imply I can omit the random effects?  I'm still a bit 
suspicious since things run fine when I omit temperature.  Am I missing 
anything here?

----- Original Message -----
From: Scott Raynaud <scott.raynaud at yahoo.com>
To: "r-sig-mixed-models at r-project.org" <r-sig-mixed-models at r-project.org>
Cc: 
Sent: Monday, December 19, 2011 8:36 AM
Subject: Re: [R-sig-ME] simulation

I played around with this some more and did some simple simulations of 75 
level 2 units pairing temprature with each of my other predictors individually.
Still nothing converged.  Everything is fine with the full model minus temperature.

Temperature is an imprtant predictor which I want to include.  Can anyone think 
of anything else I can try besides omitting temperature?

----- Original Message -----
From: Scott Raynaud <scott.raynaud at yahoo.com>
To: "r-sig-mixed-models at r-project.org" <r-sig-mixed-models at r-project.org>
Cc: 
Sent: Friday, December 16, 2011 7:33 PM
Subject: simulation

I'm using an R program (which I did not write) to simulate multilevel data 
(subjects in locations) used in power calculations. It uses lmer to fit a 
mixed logistic model to the simulated data based on inputs of means, 
variances, slopes and proportions:

###~~~~~~~~~~~~~~~~~     Required packages  ~~~~~~~~~~~~~~~~~~~~~###
    library(MASS)
    library(lme4)
###~~~~~~~~~~~~~~~~~~~     Initial inputs    ~~~~~~~~~~~~~~~~~~~~###
set.seed(12345)
siglevel<-0.03
z1score<-abs(qnorm(siglevel))
simus<-1000
n1list<-matrix(c(800,2239,3678,5117,6557,7996,  267,746,1226,1706,2186,2665,  160,448,736,1023,1311,1599,  114,320,525,731,937,1142),nrow=6,ncol=4)
sizeclass<-matrix(c( 5, 4, 3, 3, 3, 2,  15,12, 9, 9, 9, 6,  25,20,15,15,15,10,  35,28,21,21,21,14),nrow=6,ncol=4)
n2low<-20
n2high<-140
n2step<-40
npred<-10
randsize<-1
beta<-c(-1.586,-2.469,-1.197,-2.194,-2.949,-3.628,-5.227,0.198,-0.371,0.202,1.634000)
betasize<-length(beta)
effectbeta<-abs(beta)
sgnbeta<-sign(beta)
xtype<-c(2.000000,1.000000,1.000000,1.000000,1.000000,2.000000,1.000000,1.000000,1.000000,1.000000)
xprob<-c(0,0.000000,0.102000,0.141900,0.244900,0.441600,0.000000,0.533200,0.826000,0.261400,0.280000)
meanpred<-c(0,3.590,0.000,0.000,0.000,0.000,1.032,0.000,0.000,0.000,0.000)
varpred<-c(0,.010,0.000,0.000,0.000,0.000,0.083,0.000,0.000,0.000,0.000)
varpred2<-c(0,0.000,0.000,0.000,0.000,0.000,0.021,0.000,0.000,0.000,0.000)
sigma2u<-matrix(c(0.250),randsize,randsize)
n1range<-as.vector(n1list)
n2range<-seq(n2low,n2high,n2step)
n1size<-1
n2size<-length(n2range)
totalsize<-n1size*n2size
finaloutput<-matrix(0,totalsize,6*betasize)
rowcount<-1
n1col<-1
##-------------------        Inputs for model fitting       ---------------##

fixname<-c("x0","x1","x2","x3","x4","x5","x6","x7","x8","x9","x10")
 fixform<-"1+x1+x2+x3+x4+x5+x6+x7+x8+x9+x10"
randform<-"(1|l2id)"
expression<-paste(c(fixform,randform),collapse="+")
modelformula<-formula(paste("y ~",expression))
data<-vector("list",2+length(fixname))
names(data)<-c("l2id","y",fixname)
...
## +++++++++++++++++++       Set up X matrix       +++++++++++++++++++  ##
            micpred<-rnorm(length,meanpred[2],sqrt(varpred[2]))
             macpred<-rnorm(n2,0,sqrt(varpred2[2]))
              x[,2]<-micpred+macpred[l2id]
               micpred<-rmultinom(length,1,c(.102,.1419,.2449,.4416,.0696)) 
               x[,3]<-micpred[1,]
               x[,4]<-micpred[2,]
               x[,5]<-micpred[3,]
               x[,6]<-micpred[4,]
            micpred<-rnorm(length,meanpred[7],sqrt(varpred[7]))
             macpred<-rnorm(n2,0,sqrt(varpred2[7]))
              x[,7]<-micpred+macpred[l2id]
               x[,8]<-rbinom(length,1,xprob[8])
               x[,9]<-rbinom(length,1,xprob[9])
               x[,10]<-rbinom(length,1,xprob[10])
               x[,11]<-rbinom(length,1,xprob[11])
##--------------------------------------------------------------## 
                   u<-mvrnorm(n2,rep(0,randsize),sigma2u)
                    fixpart<-x%*%beta
                     randpart<-rowSums(z*u[l2id,])
                      binomprob<-exp(fixpart+randpart)/(1+exp(fixpart+randpart))
                       y<-rbinom(length,1,binomprob)
##-------------------        Inputs for model fitting       ---------------##
                 data$l2id<-as.factor(l2id)
                 data$y<-y
                 data$x0<-x[,1]
                 data$x1<-x[,2]
                 data$x2<-x[,3]
                 data$x3<-x[,4]
                 data$x4<-x[,5]
                 data$x5<-x[,6]
                 data$x6<-x[,7]
                 data$x7<-x[,8]
                 data$x8<-x[,9]
                 data$x9<-x[,10]
                 data$x10<-x[,11]
###~~~~~~~~~~      Fitting the model using lmer funtion    ~~~~~~~~~~###
(fitmodel <- lmer(modelformula,data,family=binomial(link=logit),nAGQ=1))
...

where l2id is the level 2 sample size and the ellipses indicate loop code.  
Locations are treated as random and the model is random intercept only.  
The program is set to run 1000 simulations.

I have temperature, five levels of gestational age (GA), birth wieght (BW) and four 
other categorical pedictors, all binary.  I scaled everything so that all my slopes are in the 
range of -5.2 to 1.6 and variances from .01 to .08.  I have a couple of categories 
of GA that have small probabilities (<.10).  I'm using a structured sampling approach
looking at 20, 60, 100, and 140 locations with a total n=75k.  The first looks like this:

            # groups   n
            5             800
            4             2239
            3             3678
            3             5117
            3             6557
            2             7996
Total     20           75000

As the level 2 sizes increase, the cell sizes decrease.  When I run this model in 
the simulation I get:

Warning: glm.fit: algorithm did not converge

every time the model is fit (I killed this long before it ran 1000 times).

I tried increasing the number of iterations to no avail.  I suspected linear 
dependencies among the predictors, so I took out GA (same result), put
GA back and took out BW (same result) and then took out both GA and
BW.  This ran about half the time with th other half passing warnings 
such as:

Warning: glm.fit: algorithm did not converge
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

or

Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

in addition to some like the original warning.

If I leave everything in but temperature, then it runs fine.  I also tested the full 
model separately at 50 and 75 level 2 units each with total n=75k.  Nothing converged.

I want to include temperature, but I'm not sure what else to try.  I realize 20 level 2
units is probably too small.  I think the threshold would be crossed by 60,
however.  Any suggestions?

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models