[R-sig-ME] Why do extremely similar parametric bootstraps on lmer() objects yield (not only marginally) different results?
Christian Brauner
christianvanbrauner at gmail.com
Sat May 3 13:59:48 CEST 2014
Hello all,
warning ahead: This will be quite a long post. But have faith, it is necessary.
I run parametric bootstraps in order to compare two models: I bootstrap the
likelihood ratio statistic between them. For anybody acquainted with this, the
code should be self-explanatory. Model selection is more or less done; these
are exercises I like to perform on real data sets to get a feel for how
different software packages treat the same models. Unfortunately, I am not in
a position to provide the real data. A few words about the models, though:
Both models are fit on a balanced data set which includes about 483 missing
values. There are slightly more than 11,196 observations (within-subjects
design: 70 subjects). Every subject saw every item only once. The dependent
variable is reading time; sentence type (satz_typ) and matching (passung) are
crossed two-level factors. Context (kontext) and subject (vp) are treated as
random effects. Context has 40 levels and, as you guessed, subject has 70
levels. (See this related post on:
http://stats.stackexchange.com/questions/94302/lmer-parametric-bootstrap-testing-for-fixed-effects):
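Since the variable names in the code are German, here is a rough mock-up of
the data layout. This is not the real data; it ignores the missing values and
the exact design, and only matches the overall shape:

# vp = subject, kontext = context, satz_typ = sentence type,
# passung = matching, value = reading time
wort12_lmm2_melt_mock <- expand.grid(
    vp       = factor(1:70),
    kontext  = factor(1:40),
    satz_typ = factor(c("a", "b")),
    passung  = factor(c("match", "mismatch")))
wort12_lmm2_melt_mock$value <- rlnorm(n = nrow(wort12_lmm2_melt_mock),
                                      meanlog = 6) # made-up scale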
Model formulae:
===============
# I only show the model formulae here because this part stays fixed across
# every call to lmer(). The options change, however, depending on the
# parametric bootstrap procedure used. Any change can be seen in the code:
mod_min := log(value) ~ 1 + (1 + satz_typ | vp) + (1 + satz_typ | kontext)
mod8 := log(value) ~ passung + (1 + satz_typ | vp) + (1 + satz_typ | kontext)
I set up different ways to perform the parametric bootstrap. Below I describe
each way, explain what I changed each time, and show the results. The ultimate
question will be: why the different results, especially in one of the ways to
perform the bootstrap?
Method 1 (pbkrtest).
====================
1.1. lmer() model calls:
------------------------
mod_min <- lmer(formula = log(value)
                          ~ 1
                          + (1 + satz_typ | vp)
                          + (1 + satz_typ | kontext),
                data = wort12_lmm2_melt,
                REML = FALSE)

mod8 <- lmer(formula = log(value)
                       ~ passung
                       + (1 + satz_typ | vp)
                       + (1 + satz_typ | kontext),
             data = wort12_lmm2_melt,
             REML = FALSE)
1.2. Parametric Bootstrap:
--------------------------
# Remove the na.action attribute, otherwise PBmodcomp() will complain, since
# it calls simulate() directly without the argument na.action = na.exclude.
# Relevant: see Ben Bolker's comments on this:
# https://github.com/lme4/lme4/issues/189
# Dirty fix:
mod8_nona <- mod8
mod_min_nona <- mod_min
attr(x = mod8_nona@frame,
     which = "na.action") <- NULL # remove the na.action attribute from the model
attr(x = mod_min_nona@frame,
     which = "na.action") <- NULL # remove the na.action attribute from the model
library(pbkrtest)
pbmod_mod8_mod_min <- PBmodcomp(largeModel = mod8_nona,
                                smallModel = mod_min_nona,
                                nsim = 10000,
                                cl = cl) # cl: parallel cluster (sketched below)
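The cl object passed to PBmodcomp(), and reused by the %dopar% loops in the
later methods, is a parallel cluster whose setup I have not shown. A minimal
sketch, assuming a local multicore machine:

library(parallel)
library(doParallel)
cl <- makeCluster(detectCores() - 1) # PSOCK cluster on the local machine
registerDoParallel(cl)               # backend for the foreach() %dopar% loops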
Take home message PBmodcomp(): p.value = 0.003
1.3 Notable:
------------
The na.action attribute was removed from both models before the simulate() call.
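For reference, my understanding from skimming the pbkrtest sources (treat this
as an assumption, not documented fact) is that the PBtest p-value is the usual
add-one Monte Carlo estimator, i.e. essentially what I compute by hand in
Methods 2 to 5:

# Hedged sketch of the PBtest p-value; pb_pval() is my own helper name,
# not part of pbkrtest:
pb_pval <- function(lrt_sim, lrt_obs) {
    (1 + sum(lrt_sim >= lrt_obs)) / (1 + length(lrt_sim))
}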
Method 2.
=========
2.1 lmer() model calls:
-----------------------
identical to Method 1 (pbkrtest).
2.2 Parametric Bootstrap:
-------------------------
mod <- mod8
modnull <- mod_min
# Save the observed likelihood ratio test statistic.
lrt_obs <- anova(modnull,
                 mod)$Chisq[2]
n_sim <- 10000
dattemp <- mod@frame
ptime <- system.time({
    lrt_sim <- foreach(i = 1:n_sim,
                       .combine = "c",
                       .packages = "lme4") %dopar% {
        ysim <- unlist(x = simulate(object = modnull))
        modnullsim <- lmer(formula = ysim
                                     ~ 1
                                     + (1 + satz_typ | vp)
                                     + (1 + satz_typ | kontext),
                           data = dattemp,
                           REML = FALSE) # Fit the null model.
        modaltsim <- lmer(formula = ysim
                                    ~ passung
                                    + (1 + satz_typ | vp)
                                    + (1 + satz_typ | kontext),
                          data = dattemp,
                          REML = FALSE) # Fit the alternative model.
        anova(modnullsim,
              modaltsim)$Chisq[2] # Save the likelihood ratio test statistic.
    }
})[3]
(sum(lrt_sim >= lrt_obs) + 1) / (n_sim + 1) # p.value = 0.00639936
Take home message Method 2: p.value = 0.00639936
2.3 Notable:
------------
The na.action attribute has not been altered!
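As a sanity check for this method (my own diagnostic, not part of the run
above): the simulated response should line up one-to-one with the rows of the
model frame used for refitting, with no NA padding.

ysim_check <- unlist(simulate(object = modnull))
stopifnot(length(ysim_check) == nrow(dattemp), # one value per model-frame row
          !anyNA(ysim_check))                  # no NA padding in the simulations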
Method 3.
=========
3.1 lmer() model calls:
-----------------------
# differences explained in 3.3
mod_min <- lmer(formula = log(value)
                          ~ 1
                          + (1 + satz_typ | vp)
                          + (1 + satz_typ | kontext),
                data = wort12_lmm2_melt_nona,
                REML = FALSE)

mod8 <- lmer(formula = log(value)
                       ~ passung
                       + (1 + satz_typ | vp)
                       + (1 + satz_typ | kontext),
             data = wort12_lmm2_melt_nona,
             REML = FALSE)
3.2 Parametric Bootstrap:
-------------------------
identical to Method 2
Take home message Method 3: p.value = 0.00659934
3.3 Notable:
------------
The models were fit on a version of the original data set reduced to complete
cases, i.e. on a data set created by the following code:
wort12_lmm2_melt_nona <-
    wort12_lmm2_melt[complete.cases(wort12_lmm2_melt), ]
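An equivalent, slightly shorter way to drop the incomplete rows:

wort12_lmm2_melt_nona <- na.omit(wort12_lmm2_melt)
# Note: unlike the complete.cases() subset, na.omit() attaches an
# na.action attribute to the data frame itself; that should be harmless here.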
Method 4.
=========
4.1 lmer() model calls:
-----------------------
# differences explained in 4.3
mod_min_naexclude <- lmer(formula = log(value)
                                    ~ 1
                                    + (1 + satz_typ | vp)
                                    + (1 + satz_typ | kontext),
                          data = wort12_lmm2_melt,
                          REML = FALSE,
                          na.action = na.exclude)

mod8_naexclude <- lmer(formula = log(value)
                                 ~ passung
                                 + (1 + satz_typ | vp)
                                 + (1 + satz_typ | kontext),
                       data = wort12_lmm2_melt,
                       REML = FALSE,
                       na.action = na.exclude)
4.2 Parametric Bootstrap:
-------------------------
mod <- mod8_naexclude
modnull <- mod_min_naexclude
# Save the observed likelihood ratio test statistic.
lrt_obs <- anova(modnull,
                 mod)$Chisq[2]
n_sim <- 10000
ptime <- system.time({
    lrt_sim <- foreach(i = 1:n_sim,
                       .combine = "c",
                       .packages = "lme4") %dopar% {
        ysim <- unlist(simulate(object = modnull))
        modnullsim <- refit(object = modnull,
                            newresp = ysim) # Fit the null model.
        modaltsim <- refit(object = mod,
                           newresp = ysim) # Fit the alternative model.
        anova(modnullsim,
              modaltsim)$Chisq[2] # Save the likelihood ratio test statistic.
    }
})[3]
Take home message Method 4: p.value = 0.00429957
4.3 Notable:
------------
The models were fit with the option na.action = na.exclude.
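Given the na.exclude setting, one thing I would still verify (a diagnostic of
my own, not run above) is whether simulate() now returns an NA-padded vector
of the full data length, since refit() would then see a different response
than the fresh lmer() calls in Methods 2 and 3 do:

ysim <- unlist(simulate(object = modnull))
length(ysim)        # full data length (NA-padded) or complete cases only?
anyNA(ysim)         # TRUE would indicate na.exclude padding
nrow(modnull@frame) # number of rows actually used in the fit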
Method 5.
=========
5.1 lmer() model calls:
-----------------------
# differences explained in 5.3
mod_min_naexclude <- lmer(formula = log(value)
                                    ~ 1
                                    + (1 + satz_typ | vp)
                                    + (1 + satz_typ | kontext),
                          data = wort12_lmm2_melt_nona,
                          REML = FALSE)

mod8_naexclude <- lmer(formula = log(value)
                                 ~ passung
                                 + (1 + satz_typ | vp)
                                 + (1 + satz_typ | kontext),
                       data = wort12_lmm2_melt_nona,
                       REML = FALSE)
5.2 Parametric Bootstrap:
-------------------------
identical to Method 4
Take home message Method 5: p.value = 0.0039996
5.3 Notable:
------------
The models were fit on a version of the original data set reduced to complete
cases, i.e. on a data set created by the following code:
wort12_lmm2_melt_nona <-
    wort12_lmm2_melt[complete.cases(wort12_lmm2_melt), ]
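To line up the results before stating the question:

# Take-home p-values of the five methods, side by side:
pvals <- c(method1_pbkrtest = 0.00300000,
           method2          = 0.00639936,
           method3          = 0.00659934,
           method4          = 0.00429957,
           method5          = 0.00399960)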
Question:
=========
Why do Method 2 and Method 3 yield p-values that deviate by about 0.002 from
all the other methods?
(I suspect the explanation of why I do not see it is really simple: I have
just been looking at this code for too long and am missing the obvious.)
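For calibration, a back-of-the-envelope Monte Carlo standard error (my own
addition, not part of the runs): with 10000 simulations, the binomial noise
around a p-value of roughly 0.005 is far smaller than the observed gap:

p <- 0.005; n_sim <- 10000
sqrt(p * (1 - p) / n_sim) # ~0.0007, so a deviation of ~0.002 is about 3 SEs,
                          # which fits the numerical stability noted in (1) below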
Two final remarks:
(1) I ran everything on a cluster. That's why I repeated Methods 2 and 3; the
results they yield appear to be numerically stable.
(2) I plotted the likelihood ratio test statistics for all methods, compared
them to a chi-squared distribution with df = 1, and inspected the density of
the simulated likelihood ratio statistics. Everything looks normal; the
chi-squared approximation is adequate for these models.
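The comparison in remark (2) was along these lines (a rough sketch, not the
exact plotting code):

hist(lrt_sim, breaks = 50, freq = FALSE,
     main = "Bootstrapped LRT statistics vs. chi-squared(df = 1)",
     xlab = "Likelihood ratio statistic")
curve(dchisq(x, df = 1), from = 0.01, to = max(lrt_sim),
      add = TRUE, lwd = 2) # the chi-squared(1) density diverges at 0,
                           # hence from = 0.01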
Thank you so much for your patience!
Best,
Christian Brauner