[R-sig-ME] Probability CIs for Mixed Logistic Regression

Tue Dec 10 14:42:41 CET 2013

Egor,

These papers may give you a start on finding relevant work about how to
interpret CI overlap. 

Cumming, G. (2009). Inference by eye: Reading the overlap of independent
confidence intervals. Statistics in Medicine, 28(2), 205-220. doi:
10.1002/sim.3471

Cumming, G. (2007). Inference by eye: Pictures of confidence intervals and
thinking about levels of confidence. Teaching Statistics, 29, 89-93.

Cumming, G., & Finch, S. (2005). Inference by eye: Confidence intervals and
how to read pictures of data. American Psychologist, 60(2), 170-180. doi:
10.1037/0003-066X.60.2.170

Steven J. Pierce, Ph.D.
Associate Director
Center for Statistical Training & Consulting (CSTAT)
Michigan State University
E-mail: pierces1 at msu.edu
Web: http://www.cstat.msu.edu 

-----Original Message-----
From: Egor Ananyev [mailto:egor.ananyev at gmail.com] 
Sent: Monday, December 09, 2013 8:32 PM
To: Ben Bolker
Cc: r-sig-mixed-models
Subject: Re: [R-sig-ME] Probability CIs for Mixed Logistic Regression

Hi Ben,

Thanks a lot! I've since tried to do the same with predict(), but, as you
noted, it doesn't include the function for estimating the standard error (<
https://github.com/lme4/lme4/issues/147>). So I also tried to do this with
a bootstrap method from ez package:

# with ez package (bootstrap):
library(ez)
preds = ezPredict(m, boot = TRUE)
ezPlot2(preds, x=vision)

But the intervals still overlap: <
https://dl.dropboxusercontent.com/u/9147994/ezPlot2.pdf>. I'll try to run
the search and see what I can find about conservative CIs...

Thanks again,
--Egor

On 10 December 2013 05:51, Ben Bolker <bbolker at gmail.com> wrote:

> On 13-12-07 10:50 PM, Egor Ananyev wrote:
> > Hello everyone,
> >
> > I have a question on how to calculate confidence intervals for predicted
> > proportions from a mixed effects logistic model with glmer. It's
probably
> > very basic, but I'm at my wit's end. Below is a simplified reproducible
> > example. I have a single three-level categorical independent variable.
> The
> > method that I use (based on 1.96*SE) doesn't seem to work, because the
> > confidence intervals for significantly different effects overlap.
> >
> > Here's the output of the code below:
> >              Est.   SE     z      P       BSum   ME     BlSum  BuSum
> > PSum   PlSum  PuSum
> > (Intercept)  2.21   0.60   3.69  <0.001   2.21   1.17   1.04   3.38
> > 0.90   0.74   0.97
> > vision1     -2.41   0.91  -2.66   0.008  -0.20   1.78  -1.98   1.58
> > 0.45   0.12   0.83
> > vision2     -1.11   0.97  -1.14   0.253   1.10   1.91  -0.81   3.00
> > 0.75   0.31   0.95
> >
> > As you can see, (Intercept) and vision1 CIs overlap -- for both
> cumulative
> > (sum) B and the resulting proportions. I killed a few weekends to try to
> > solve this problem and couldn't. Your help would be greatly appreciated.
> >
> > Thanks,
> > --Egor
> >
>
> # preparing the data set:
> ## file URL: https://dl.dropboxusercontent.com/u/9147994/ds_seen.csv
> ## inputDir = 'C:/Dropbox/Computer/Eclipse/R/HT/_input/'
> library(RCurl)
> txt <- getURL("https://dl.dropboxusercontent.com/u/9147994/ds_seen.csv")
> ds = read.csv(textConnection(txt),header=TRUE)
> ds$subject = as.factor(ds$subject)
> ds$vision = as.factor(ds$vision)
>
> # running the model:
> library(lme4)
> m = glmer(seen ~ vision + (1|subject), data = ds, family = 'binomial')
> ## better to use accessor methods if possible
> msum = as.data.frame(coef(summary(m)))
>
> contrMat <- matrix(c(1,0,0,1,1,0,1,0,1),byrow=TRUE,ncol=3)
> msums <- contrMat %*% fixef(m)
> # calculating confidence intervals for the estimates:
> msum$BSum[1] = msum$Estimate[1]
> msum$BSum[2:3] = msum$Estimate[1] + msum$Estimate[2:3]
> all.equal(msum$BSum,c(msums))  ## TRUE
>
> ## BMB:  I don't know why you expect these calculations to work;
> ##  the correct calculation is on the variance-covariance matrix
> msum$ME = msum$`Std. Error` * 1.96 # margin of error
> msum$BlSum = msum$BSum - msum$ME # lower bound on B sum
> msum$BuSum = msum$BSum + msum$ME # upper bound on B sum
> msum$PSum = plogis(msum$BSum) # predicted probability
> msum$PlSum = plogis(msum$BlSum) # lower bound on predicted probability
> msum$PuSum = plogis(msum$BuSum) # upper bound on predicted probability
>
> mvcov <- contrMat %*% vcov(m) %*% t(contrMat)
> mstderr <- sqrt(diag(mvcov))
> mlwr <- msums - 1.96*mstderr
> mupr <- msums + 1.96*mstderr
> mlwrpred <- plogis(mlwr)
> muprpred <- plogis(mupr)
>
> The easier way to do this (which isn't as generalizable
> to arbitrary contrasts, but works if all you want to do
> is predict values for each level):
>
> m2  <- update(m, . ~ . - 1)  ## take out the intercept
> all.equal(unname(fixef(m2)),c(msums), tol=1e-5)  ## TRUE
> (cc <- confint(m2,method="Wald"))
> all.equal(cbind(mlwr,mupr),unname(cc), tol=2e-5)  ## TRUE
>
> The other thing to notice is that all of these confidence
> intervals still **do** overlap.  Taking overlap of 95% confidence
intervals
> as indicating 95% difference is conservative: try searching for
> "95% confidence intervals overlap conservative" on Google scholar ...
>
> For what it's worth most of this question isn't GLMM-specific
> or even GLM-specific ...
>

	[[alternative HTML version deleted]]