[R-sig-ME] fixed effects estimates too small in binomial GLMM with low "success"-rate
Henrik Singmann
henrik.singmann at psychologie.uni-freiburg.de
Sun May 18 00:07:13 CEST 2014
Dear list,
I would really appreciate your input on an issue in which the fixed effects estimates of a binomial GLMM with relatively low rate of "successes" are too low (compared to the observed effect) by around .5%.
The data comes from six comparable experiments in which we measured participants' performance errors on a series of trials in a design with one factor with two levels ("a" and "b"). Overall the difference in error rate between the two levels was very small (~0.5%), but when analyzing all data (with participant and experiment as random effects) the effect of factor reached significance (p = .03), which is also consistent with other published data.
However, the estimated effects are around .5% lower than the observed mean error rates indicating that the model may perhaps not adequately describe the data. Furthermore, I also get a convergence warning ("Model failed to converge with max|grad| = 0.0013716 (tol = 0.001)").
####### code 1 ##########
require(lme4)
dat <- read.table("http://pastebin.com/raw.php?i=KWjD5EG3")
dat$experiment <- factor(dat$experiment)
# observed effects
aggregate(errors ~ factor, dat, mean) #unweighted
## factor errors
## 1 a 0.02266914
## 2 b 0.02772540
by(dat, dat$factor, function(x) with(x, sum(errors*weights)/sum(weights))) #weighted
## dat$factor: a
## [1] 0.0222343
## ------------------------------------------------------------
## dat$factor: b
## [1] 0.02777651
m1 <- glmer(errors ~ factor + (factor|id) + (1|experiment), dat, weights = dat$weights, family = binomial)
## Warnmeldung:
## In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.0013716 (tol = 0.001)
coef(summary(m1))[2,, drop = FALSE]
## Estimate Std. Error z value Pr(>|z|)
## factorb 0.1788881 0.08005533 2.234556 0.02544653
# estimated effect:
logistic <- function(x) 1/(1+exp(-x))
logistic(cumsum(fixef(m1)))
## (Intercept) factorb
## 0.01825229 0.02174991
####### end code 1 ##########
Interestingly, I can replicate this behavior when simulating binomial data similar to the one I am having. When the overall mean is relatively small, the estimates from the GLMM are consistently too low. When increasing the overall mean, this consistent bias in the estimated effect seems to go away (and the observed effect gets larger due to the non-linear nature of the logistic model). The code of the simulation for one data set follows below.
My question is: Does anyone have an idea what I could do to reach a better agreement between observed and predicted values? I tried different link functions, but that didn't help. And I am unsure whether I can trust my model given this discrepancy.
Thanks a lot,
Henrik
##### code 2 #####
mu <- -3.7 # when larger, problem disappears
n_experiments <- 6
n_participants <- 20
n_trials <- 400
sigma_effect <- 0.2
effect_size <- 0.4
sigma_p <- 0.7
sigma_e <- 0.1
prob_level <- 0.5
# create data:
df <- expand.grid(experiment = factor(seq(1, n_experiments)), id = factor(seq(1, n_participants)), factor = c(-1, 1))
df$id <- with(df, experiment:id)
df <- merge(df, data.frame(experiment = seq(1, n_experiments), re_e = rnorm(n_experiments, sd = sigma_e)))
df <- merge(df, data.frame(id = levels(df$id), re_p = rnorm(length(levels(df$id)), sd = sigma_p)))
df <- merge(df, data.frame(id = levels(df$id), re_a = rnorm(length(levels(df$id)), sd = sigma_effect)))
df <- with(df, df[order(id, factor),])
tmp <- rbinom(n_experiments*n_participants, n_trials, prob_level)
tmp <- rep(tmp, each = 2)
tmp[c(FALSE, TRUE)] <- n_trials-tmp[c(FALSE, TRUE)]
df$errors <- mapply(function(x, y) rbinom(1, y, x), with(df, logistic(mu+re_p+(factor*(effect_size/2)+factor*re_a)+re_e)), tmp)/tmp
df$weights <- tmp
df$factor <- factor(df$factor)
aggregate(errors ~ factor, df, mean) #unweighted
## factor errors
## 1 -1 0.02352719
## 2 1 0.03718523
s1 <- glmer(errors ~ factor + (factor|id) + (1|experiment), df, weights = df$weights, family = binomial)
## Warnmeldung:
## In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.0130648 (tol = 0.001)
coef(summary(s1))[2,, drop = FALSE]
# Estimate Std. Error z value Pr(>|z|)
# factor1 0.4765694 0.07578773 6.288213 3.211416e-10
logistic(cumsum(fixef(s1)))
# (Intercept) factor1
# 0.01849934 0.02946117
#### end code 2 ####
--
Dr. Henrik Singmann
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann
More information about the R-sig-mixed-models
mailing list