[R-sig-ME] Problems with parametric bootstrap, glmer
Gustaf Granath
Gustaf.Granath at ebc.uu.se
Fri Jan 27 12:58:55 CET 2012
Hi all,
A few times a have encountered a problem when I try to perform a
parametric bootstrap to obtain a "corrected" p-value for a fixed effect
using glmer, like (lmer(y~x+x1+x2+(1|ran)+(1|id),data,family="poisson").
I can run the model without errors but when a use simulate(model) as
response (y) I sometimes get the classic " Cholmod warning 'not positive
definite'". Even more peculiar is that when I use refit() (e.g. as
described in the lme4 documentation under 'simulate-mer'), then this
happens more often compared to if I use the update() function.
So I can get a p-value based on the LR (anova(model1,model2)) but if I
run a parametric bootstrap (n=1000), maybe only 600-700 runs converge.
My models are not particularly overparametrized (maximum 16 fix coef, 2
random, N=500), random effects are not close to zero and centering data
does not help (actually, it made it worse in some cases...?). Has anyone
else encountered this problem? I assume that it is wrong to use the
600-700 successful bootstrap runs to calculate a parametric bootstrap
P-value. Any alternative ideas how to proceed?
I tried to put together test code (see below), I hope it works. It
should show that update() works better than refit() but I wasnt able to
reproduce that update() can fail as well (maybe I didnt run it long
enough though).
Cheers,
Gustaf Granath (PhD student)
##############CODE
set.seed(100)
dat<-expand.grid(y=c(1:5),site=as.factor(c(1:10)),tree=c("spruce","pine"))
i=9
for (i in 0:i) {
dat$y[(1+i*5):(5+5*i)]<- rpois(5,(5+1*i))+round(rnorm(5,0,2))
dat$y[(51+i*5):(50+(5+5*i))]<- rpois(5,(9+1*i))+round(rnorm(5,0,2))
}
dat$cov<-rnorm(100,100,10)
dat$id<- 1:nrow(dat)
#run models
m1<-lmer(y~tree+cov+(1|site)+(1|id),dat,family="poisson")
m0<-lmer(y~cov+(1|site)+(1|id),dat,family="poisson")
#function for refit()
pboot2 <- function(m0,m1) {
s <- simulate(m0)
L0 <- logLik(refit(m0,s),REML=F)
L1 <- logLik(refit(m1,s),REML=F)
c(2*(L1-L0))
}
dist1<-replicate(10,pboot2(m0,m1))
#use update() instead (no errors now)
dist2=numeric(10)
for (i in 1:10) {
dat$s <- simulate(m0)$sim_1
m0.refit <- update(m0,s~.,data=dat)
L0 <- logLik(m0.refit,REML=F)
m1.refit <- update(m1,s~.,data=dat)
L1 <- logLik(m1.refit,REML=F)
dist2[i]<- c(2*(L1-L0))
}
#compare refit() and update()
dist1 #with refit()
dist2 #with update()
More information about the R-sig-mixed-models
mailing list