[R-sig-ME] Probability CIs for Mixed Logistic Regression
Ben Bolker
bbolker at gmail.com
Mon Dec 9 22:51:32 CET 2013
On 13-12-07 10:50 PM, Egor Ananyev wrote:
> Hello everyone,
>
> I have a question on how to calculate confidence intervals for predicted
> proportions from a mixed effects logistic model with glmer. It's probably
> very basic, but I'm at my wit's end. Below is a simplified reproducible
> example. I have a single three-level categorical independent variable. The
> method that I use (based on 1.96*SE) doesn't seem to work, because the
> confidence intervals for significantly different effects overlap.
>
> Here's the output of the code below:
> Est. SE z P BSum ME BlSum BuSum
> PSum PlSum PuSum
> (Intercept) 2.21 0.60 3.69 <0.001 2.21 1.17 1.04 3.38
> 0.90 0.74 0.97
> vision1 -2.41 0.91 -2.66 0.008 -0.20 1.78 -1.98 1.58
> 0.45 0.12 0.83
> vision2 -1.11 0.97 -1.14 0.253 1.10 1.91 -0.81 3.00
> 0.75 0.31 0.95
>
> As you can see, (Intercept) and vision1 CIs overlap -- for both cumulative
> (sum) B and the resulting proportions. I killed a few weekends to try to
> solve this problem and couldn't. Your help would be greatly appreciated.
>
> Thanks,
> --Egor
>
# preparing the data set:
## file URL: https://dl.dropboxusercontent.com/u/9147994/ds_seen.csv
## inputDir = 'C:/Dropbox/Computer/Eclipse/R/HT/_input/'
library(RCurl)
txt <- getURL("https://dl.dropboxusercontent.com/u/9147994/ds_seen.csv")
ds = read.csv(textConnection(txt),header=TRUE)
ds$subject = as.factor(ds$subject)
ds$vision = as.factor(ds$vision)
# running the model:
library(lme4)
m = glmer(seen ~ vision + (1|subject), data = ds, family = 'binomial')
## better to use accessor methods if possible
msum = as.data.frame(coef(summary(m)))
contrMat <- matrix(c(1,0,0,1,1,0,1,0,1),byrow=TRUE,ncol=3)
msums <- contrMat %*% fixef(m)
# calculating confidence intervals for the estimates:
msum$BSum[1] = msum$Estimate[1]
msum$BSum[2:3] = msum$Estimate[1] + msum$Estimate[2:3]
all.equal(msum$BSum,c(msums)) ## TRUE
## BMB: I don't know why you expect these calculations to work;
## the correct calculation is on the variance-covariance matrix
msum$ME = msum$`Std. Error` * 1.96 # margin of error
msum$BlSum = msum$BSum - msum$ME # lower bound on B sum
msum$BuSum = msum$BSum + msum$ME # upper bound on B sum
msum$PSum = plogis(msum$BSum) # predicted probability
msum$PlSum = plogis(msum$BlSum) # lower bound on predicted probability
msum$PuSum = plogis(msum$BuSum) # upper bound on predicted probability
mvcov <- contrMat %*% vcov(m) %*% t(contrMat)
mstderr <- sqrt(diag(mvcov))
mlwr <- msums - 1.96*mstderr
mupr <- msums + 1.96*mstderr
mlwrpred <- plogis(mlwr)
muprpred <- plogis(mupr)
The easier way to do this (which isn't as generalizable
to arbitrary contrasts, but works if all you want to do
is predict values for each level):
m2 <- update(m, . ~ . - 1) ## take out the intercept
all.equal(unname(fixef(m2)),c(msums), tol=1e-5) ## TRUE
(cc <- confint(m2,method="Wald"))
all.equal(cbind(mlwr,mupr),unname(cc), tol=2e-5) ## TRUE
The other thing to notice is that all of these confidence
intervals still **do** overlap. Taking overlap of 95% confidence intervals
as indicating 95% difference is conservative: try searching for
"95% confidence intervals overlap conservative" on Google scholar ...
For what it's worth most of this question isn't GLMM-specific
or even GLM-specific ...
More information about the R-sig-mixed-models
mailing list