[R-sig-ME] Prediction and bootstrapping with glmer

Ruben Arslan rubenarslan at gmail.com
Mon Mar 9 12:07:08 CET 2015

Dear list,

I want to inquire about the current state of affairs regarding bootMer and
refit (in lme4 1.1-7).

I am attaching a helper function I made to plot predictions for a factor
variable, maybe it's useful for someone else.
The function makes plots like this https://i.imgur.com/AlKCS9C.png
I plan to use this to visually demonstrate that it is justifiable to use a
linear predictor instead of a factor variable (and hence to simplify my
models) and also simply to show the effect in easily understood quantities
(type='response'), since I noticed at a conference that many people
interpreted my ORs as probabilities because they're smaller than 1,
regardless of intercept and me trying to explain.

I also made a few attempts to do this "the easy way" (?) using the
instructions here: http://glmm.wikidot.com/faq#predconf but my results did
not agree with the bootstrapped results and I liked the violins. I that FAQ
may be out-of-date.
Of course, given my new problems, now I'm not sure if it wasn't the
bootstrapped results that were off.

I am sometimes getting ## T, : some bootstrap runs failed (100/100)
and thought it might be related to this issue this
For me, it's basically either/or though, mostly all runs fail or all

I read there that there might be a computationally intensive workaround for
problems related to refit (using update instead).
I have to bootstrap my predictions on a cluster "anyway" so I would be able
to stomach a performance cost (if it's not huge and even more so if I were
able to use MPI for parallelisation instead of "just" multicore). However,
I've examined the bootMer code and consider it likely that I would botch
things up, if I tried to hotfix it into a self-made bootMer2.

I believe the line
            foo <- try(FUN(refit(x, ss[[i]])), silent = TRUE)

would have to make way for

           foo <- try(FUN({
            newdata = x at frame
            newdata[, names(fit at frame)[1] ] = ss[[i]]
}), silent = TRUE)

but I thought it smarter to ask here first, maybe I'm going about the
entirely wrong way.

Best regards,

Ruben Arslan

Georg August University Göttingen
Biological Personality Psychology
Georg Elias Müller Institute of Psychology
Goßlerstr. 14
37073 Göttingen
Tel.: +49 551 3920704

plot_factor_response = function(fit, factor_base = "paternalage.factor",
nsim = 100,
newdata # for context: I left out code where I generate newdata to only
contain one desired level for all predictor except "paternalage.factor"
) {
library(lme4); library(ggplot2); library(dplyr);
modelname = deparse(substitute(fit))
outcome = names(fit at frame)[1]

mypred = function(fit) predict(fit, type = "response", re.form = ~0,
newdata = newdata)
strapped = bootMer(fit, mypred, nsim = nsim, parallel = "multicore", ncpus
= parallel::detectCores())
 mstrapped = suppressMessages(as.data.frame(strapped) %>% melt()) # to
suppress "No id variables; using all as measure variables" which is desired
newdata$variable = 1:nrow(newdata)
mstrapped = merge(mstrapped, newdata, by = 'variable')

print( ggplot(mstrapped,  aes_string(x = factor_base, y = "value")) +
 geom_violin(colour = "transparent", fill = "#5ea16e", alpha = 0.3)  +
 geom_pointrange(stat = "summary", fun.data = "mean_sdl")+
 ggtitle(modelname) +
 scale_y_continuous(outcome)  +
 geom_smooth(aes(weight = 1/sd(value, na.rm = T), group = 1),method = "lm",
se = F, lty = "dashed") +
 class(x = mstrapped) = c("mboot",class(x = mstrapped))

	[[alternative HTML version deleted]]

More information about the R-sig-mixed-models mailing list