[R-sig-ME] Probability CIs for Mixed Logistic Regression

Mon Dec 9 22:51:32 CET 2013

On 13-12-07 10:50 PM, Egor Ananyev wrote:
> Hello everyone,
> 
> I have a question on how to calculate confidence intervals for predicted
> proportions from a mixed effects logistic model with glmer. It's probably
> very basic, but I'm at my wit's end. Below is a simplified reproducible
> example. I have a single three-level categorical independent variable. The
> method that I use (based on 1.96*SE) doesn't seem to work, because the
> confidence intervals for significantly different effects overlap.
> 
> Here's the output of the code below:
>              Est.   SE     z      P       BSum   ME     BlSum  BuSum
> PSum   PlSum  PuSum
> (Intercept)  2.21   0.60   3.69  <0.001   2.21   1.17   1.04   3.38
> 0.90   0.74   0.97
> vision1     -2.41   0.91  -2.66   0.008  -0.20   1.78  -1.98   1.58
> 0.45   0.12   0.83
> vision2     -1.11   0.97  -1.14   0.253   1.10   1.91  -0.81   3.00
> 0.75   0.31   0.95
> 
> As you can see, (Intercept) and vision1 CIs overlap -- for both cumulative
> (sum) B and the resulting proportions. I killed a few weekends to try to
> solve this problem and couldn't. Your help would be greatly appreciated.
> 
> Thanks,
> --Egor
> 

# preparing the data set:
## file URL: https://dl.dropboxusercontent.com/u/9147994/ds_seen.csv
## inputDir = 'C:/Dropbox/Computer/Eclipse/R/HT/_input/'
library(RCurl)
txt <- getURL("https://dl.dropboxusercontent.com/u/9147994/ds_seen.csv")
ds = read.csv(textConnection(txt),header=TRUE)
ds$subject = as.factor(ds$subject)
ds$vision = as.factor(ds$vision)

# running the model:
library(lme4)
m = glmer(seen ~ vision + (1|subject), data = ds, family = 'binomial')
## better to use accessor methods if possible
msum = as.data.frame(coef(summary(m)))

contrMat <- matrix(c(1,0,0,1,1,0,1,0,1),byrow=TRUE,ncol=3)
msums <- contrMat %*% fixef(m)
# calculating confidence intervals for the estimates:
msum$BSum[1] = msum$Estimate[1]
msum$BSum[2:3] = msum$Estimate[1] + msum$Estimate[2:3]
all.equal(msum$BSum,c(msums))  ## TRUE

## BMB:  I don't know why you expect these calculations to work;
##  the correct calculation is on the variance-covariance matrix
msum$ME = msum$`Std. Error` * 1.96 # margin of error
msum$BlSum = msum$BSum - msum$ME # lower bound on B sum
msum$BuSum = msum$BSum + msum$ME # upper bound on B sum
msum$PSum = plogis(msum$BSum) # predicted probability
msum$PlSum = plogis(msum$BlSum) # lower bound on predicted probability
msum$PuSum = plogis(msum$BuSum) # upper bound on predicted probability

mvcov <- contrMat %*% vcov(m) %*% t(contrMat)
mstderr <- sqrt(diag(mvcov))
mlwr <- msums - 1.96*mstderr
mupr <- msums + 1.96*mstderr
mlwrpred <- plogis(mlwr)
muprpred <- plogis(mupr)

The easier way to do this (which isn't as generalizable
to arbitrary contrasts, but works if all you want to do
is predict values for each level):

m2  <- update(m, . ~ . - 1)  ## take out the intercept
all.equal(unname(fixef(m2)),c(msums), tol=1e-5)  ## TRUE
(cc <- confint(m2,method="Wald"))
all.equal(cbind(mlwr,mupr),unname(cc), tol=2e-5)  ## TRUE

The other thing to notice is that all of these confidence
intervals still **do** overlap.  Taking overlap of 95% confidence intervals
as indicating 95% difference is conservative: try searching for
"95% confidence intervals overlap conservative" on Google scholar ...

For what it's worth most of this question isn't GLMM-specific
or even GLM-specific ...