[R-sig-ME] Probability CIs for Mixed Logistic Regression
Steven J. Pierce
pierces1 at msu.edu
Tue Dec 10 14:42:41 CET 2013
Egor,
These papers may give you a start on finding relevant work about how to
interpret CI overlap.
Cumming, G. (2009). Inference by eye: Reading the overlap of independent
confidence intervals. Statistics in Medicine, 28(2), 205-220. doi:
10.1002/sim.3471
Cumming, G. (2007). Inference by eye: Pictures of confidence intervals and
thinking about levels of confidence. Teaching Statistics, 29, 89-93.
Cumming, G., & Finch, S. (2005). Inference by eye: Confidence intervals and
how to read pictures of data. American Psychologist, 60(2), 170-180. doi:
10.1037/0003-066X.60.2.170
Steven J. Pierce, Ph.D.
Associate Director
Center for Statistical Training & Consulting (CSTAT)
Michigan State University
E-mail: pierces1 at msu.edu
Web: http://www.cstat.msu.edu
-----Original Message-----
From: Egor Ananyev [mailto:egor.ananyev at gmail.com]
Sent: Monday, December 09, 2013 8:32 PM
To: Ben Bolker
Cc: r-sig-mixed-models
Subject: Re: [R-sig-ME] Probability CIs for Mixed Logistic Regression
Hi Ben,
Thanks a lot! I've since tried to do the same with predict(), but, as you
noted, it doesn't include the function for estimating the standard error (<
https://github.com/lme4/lme4/issues/147>). So I also tried to do this with
a bootstrap method from ez package:
# with ez package (bootstrap):
library(ez)
preds = ezPredict(m, boot = TRUE)
ezPlot2(preds, x=vision)
But the intervals still overlap: <
https://dl.dropboxusercontent.com/u/9147994/ezPlot2.pdf>. I'll try to run
the search and see what I can find about conservative CIs...
Thanks again,
--Egor
On 10 December 2013 05:51, Ben Bolker <bbolker at gmail.com> wrote:
> On 13-12-07 10:50 PM, Egor Ananyev wrote:
> > Hello everyone,
> >
> > I have a question on how to calculate confidence intervals for predicted
> > proportions from a mixed effects logistic model with glmer. It's
probably
> > very basic, but I'm at my wit's end. Below is a simplified reproducible
> > example. I have a single three-level categorical independent variable.
> The
> > method that I use (based on 1.96*SE) doesn't seem to work, because the
> > confidence intervals for significantly different effects overlap.
> >
> > Here's the output of the code below:
> > Est. SE z P BSum ME BlSum BuSum
> > PSum PlSum PuSum
> > (Intercept) 2.21 0.60 3.69 <0.001 2.21 1.17 1.04 3.38
> > 0.90 0.74 0.97
> > vision1 -2.41 0.91 -2.66 0.008 -0.20 1.78 -1.98 1.58
> > 0.45 0.12 0.83
> > vision2 -1.11 0.97 -1.14 0.253 1.10 1.91 -0.81 3.00
> > 0.75 0.31 0.95
> >
> > As you can see, (Intercept) and vision1 CIs overlap -- for both
> cumulative
> > (sum) B and the resulting proportions. I killed a few weekends to try to
> > solve this problem and couldn't. Your help would be greatly appreciated.
> >
> > Thanks,
> > --Egor
> >
>
> # preparing the data set:
> ## file URL: https://dl.dropboxusercontent.com/u/9147994/ds_seen.csv
> ## inputDir = 'C:/Dropbox/Computer/Eclipse/R/HT/_input/'
> library(RCurl)
> txt <- getURL("https://dl.dropboxusercontent.com/u/9147994/ds_seen.csv")
> ds = read.csv(textConnection(txt),header=TRUE)
> ds$subject = as.factor(ds$subject)
> ds$vision = as.factor(ds$vision)
>
> # running the model:
> library(lme4)
> m = glmer(seen ~ vision + (1|subject), data = ds, family = 'binomial')
> ## better to use accessor methods if possible
> msum = as.data.frame(coef(summary(m)))
>
> contrMat <- matrix(c(1,0,0,1,1,0,1,0,1),byrow=TRUE,ncol=3)
> msums <- contrMat %*% fixef(m)
> # calculating confidence intervals for the estimates:
> msum$BSum[1] = msum$Estimate[1]
> msum$BSum[2:3] = msum$Estimate[1] + msum$Estimate[2:3]
> all.equal(msum$BSum,c(msums)) ## TRUE
>
> ## BMB: I don't know why you expect these calculations to work;
> ## the correct calculation is on the variance-covariance matrix
> msum$ME = msum$`Std. Error` * 1.96 # margin of error
> msum$BlSum = msum$BSum - msum$ME # lower bound on B sum
> msum$BuSum = msum$BSum + msum$ME # upper bound on B sum
> msum$PSum = plogis(msum$BSum) # predicted probability
> msum$PlSum = plogis(msum$BlSum) # lower bound on predicted probability
> msum$PuSum = plogis(msum$BuSum) # upper bound on predicted probability
>
> mvcov <- contrMat %*% vcov(m) %*% t(contrMat)
> mstderr <- sqrt(diag(mvcov))
> mlwr <- msums - 1.96*mstderr
> mupr <- msums + 1.96*mstderr
> mlwrpred <- plogis(mlwr)
> muprpred <- plogis(mupr)
>
> The easier way to do this (which isn't as generalizable
> to arbitrary contrasts, but works if all you want to do
> is predict values for each level):
>
> m2 <- update(m, . ~ . - 1) ## take out the intercept
> all.equal(unname(fixef(m2)),c(msums), tol=1e-5) ## TRUE
> (cc <- confint(m2,method="Wald"))
> all.equal(cbind(mlwr,mupr),unname(cc), tol=2e-5) ## TRUE
>
> The other thing to notice is that all of these confidence
> intervals still **do** overlap. Taking overlap of 95% confidence
intervals
> as indicating 95% difference is conservative: try searching for
> "95% confidence intervals overlap conservative" on Google scholar ...
>
> For what it's worth most of this question isn't GLMM-specific
> or even GLM-specific ...
>
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list