[R] simulation

Scott Raynaud scott.raynaud at yahoo.com
Fri Dec 16 22:56:47 CET 2011


I'm using an R program (which I did not write) to simulate multilevel data 
(subjects in locations) used in power calculations. It uses lmer to fit a 
mixed logistic model to the simulated data based on inputs of means, 
variances, slopes and proportions:
 
(fitmodel <- lmer(modelformula,data,family=binomial(link=logit),nAGQ=1))

where modelformula is set up in another part of the program.  Locations are
treated as random and the model is random intercept only.  The program is 
set to run 1000 simulations.
 
I have temperature, five levels of gestational age (GA), birth wieght (BW) and four 
other categorical pedictors, all binary.  I scaled everything so that all my slopes are in the 
range of -5.2 to 1.6 and variances from .01 to .08.  I have a couple of categories 
of GA that have small probabilities (<.10).  I'm using a structured sampling approach
looking at 20, 60, 100, and 140 locations with a total n=75k.  The first looks like this:
 
            # groups   n
            5             800
            4             2239
            3             3678
            3             5117
            3             6557
            2             7996
Total     20           75000
 
As the level 2 sizes increase, the cell sizes decrease.  When I run this model in 
the simulation I get:
 
Warning: glm.fit: algorithm did not converge
 
every time the model is fit (I killed this long before it ran 1000 times).
 
I tried increasing the number of iterations to no avail.  I suspected linear 
dependencies among the predictors, so I took out GA (same result), put
GA back and took out BW (same result) and then took out both GA and
BW.  This ran about half the time with th other half passing warnings 
such as:
 
Warning: glm.fit: algorithm did not converge
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
 
or
 
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
 
in addition to some like the original warning.
 
If I leave everything in but temperature, then it runs fine.  I also tested the full 
model separately at 50 and 75 level 2 units each with total n=75k.  Nothing converged.
 
I want to include temperature, but I'm not sure what else to try.  Any 
suggestions?



More information about the R-help mailing list