[R-sig-ME] Probability CIs for Mixed Logistic Regression

Chris Howden chris at trickysolutions.com.au
Wed Dec 11 07:07:27 CET 2013


I often use 90% CIs since they give a closer approximation to a 5%
hypothesis test than 95% CIs. They are still conservative, but less so than
95% CIs: the intervals can still overlap when a 5% hypothesis test would
reject, just not as often. I had a paper somewhere that suggested using 90%
CIs for this reason, but I can't find it. The best I can do is the
following, which gives a more general solution.

Goldstein, H., & Healy, M. J. R. (1995). The graphical presentation of a
collection of means. Journal of the Royal Statistical Society, Series A
(Statistics in Society), 158(1), 175-177.
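
As a rough illustration of the idea (a sketch under a normal approximation
for two independent estimates, not the exact procedure from the paper), the
interval level at which "intervals just touch" coincides with a two-sided
test at level alpha can be computed directly. The function name
matching_level is made up for this example.

```r
# Two independent estimates with standard errors se1 and se2.
# Intervals est_i +/- k * se_i just touch when |diff| = k * (se1 + se2).
# A two-sided z test at level alpha rejects when
# |diff| >= qnorm(1 - alpha / 2) * sqrt(se1^2 + se2^2).
# Equating the two gives the multiplier k and hence the interval level.
matching_level <- function(se1 = 1, se2 = 1, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)
  k <- z * sqrt(se1^2 + se2^2) / (se1 + se2)
  2 * pnorm(k) - 1
}

matching_level()                  # equal SEs: roughly 0.83, below both 0.90 and 0.95
matching_level(se1 = 1, se2 = 3)  # unequal SEs push the matching level upwards
```

With equal standard errors the matching level is below 90%, which is
consistent with 90% intervals still being somewhat conservative.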



Chris Howden B.Sc. (Hons) GStat.
Founding Partner
Evidence Based Strategic Development, IP Commercialisation and Innovation,
Data Analysis, Modelling and Training
(mobile) 0410 689 945
(skype) chris.howden
chris at trickysolutions.com.au





-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org
[mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Egor
Ananyev
Sent: Wednesday, 11 December 2013 4:51 PM
To: Steven J. Pierce
Cc: r-sig-mixed-models
Subject: Re: [R-sig-ME] Probability CIs for Mixed Logistic Regression

Thank you for the recommended readings, Steven!

Cumming (2009) seems to be a proponent of the visual inspection with the
following rule for proportions (which is my case):
"Two independent proportions, 95 per cent CIs: For a comparison of two
independent proportions, two-tailed p <= 0.05 when Proportion Overlap is
about 0.5 or less - in other words the overlap of the 95 per cent CIs is no
more than about half the average arm length, meaning the average of the
two arms that overlap (Figure 3, left panel)."
You don't have to study cognitive psychology to see how this could be a
bit problematic.
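
To make the quoted rule concrete, here is a minimal sketch of how the
proportion overlap could be computed from two estimates and their interval
limits. The function name proportion_overlap and the numbers in the usage
line are hypothetical; the definition follows the wording of the quote
(overlap divided by the average length of the two arms that face each other).

```r
# Proportion overlap of two confidence intervals, following the quoted rule:
# overlap / mean(length of the two arms that overlap).
# A negative value means there is a gap between the intervals.
proportion_overlap <- function(est1, lo1, hi1, est2, lo2, hi2) {
  # Put the interval with the smaller point estimate first.
  if (est1 > est2) {
    return(proportion_overlap(est2, lo2, hi2, est1, lo1, hi1))
  }
  overlap   <- hi1 - lo2            # upper limit of lower CI minus lower limit of upper CI
  upper_arm <- hi1 - est1           # arm of the lower interval that points up
  lower_arm <- est2 - lo2           # arm of the upper interval that points down
  overlap / mean(c(upper_arm, lower_arm))
}

# Hypothetical intervals: 0.45 (0.30, 0.60) and 0.70 (0.55, 0.85)
po <- proportion_overlap(0.45, 0.30, 0.60, 0.70, 0.55, 0.85)
po <= 0.5   # the rule of thumb would then suggest p <= 0.05
```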

But even that advice is only valid for groups larger than 10. In my case,
the design is unbalanced, with two groups smaller than 10 (6 and 9). On a
practical level, I wonder if, given the discrepancy between the
probabilities and CIs, we should display 1.5*SE intervals on the estimated
probability plots. This is what Moses (1987) did here <
http://elearning.lsgi.org/GDM/Handouts/Graphical%20Methods%20in%20Statistical%20Analysis.pdf>.
This approach allows easy detection of significant differences "by eye".
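
A quick way to see what a given multiplier implies, sketched under the
assumption of two independent, approximately normal estimates (implied_p is
a made-up helper name): the two-sided p-value at the point where the two
est +/- k*SE intervals just touch.

```r
# If intervals est_i +/- k * se_i just touch, the difference between the
# estimates is k * (se1 + se2), so the z statistic for the difference is
# k * (se1 + se2) / sqrt(se1^2 + se2^2).
implied_p <- function(k, se1 = 1, se2 = 1) {
  z <- k * (se1 + se2) / sqrt(se1^2 + se2^2)
  2 * pnorm(-z)
}

implied_p(1.5)                   # 1.5*SE bars, equal SEs: somewhat below 0.05
implied_p(qnorm(0.95))           # 90% intervals: smaller still
implied_p(qnorm(0.975))          # 95% intervals: far below 0.05
implied_p(1.5, se1 = 1, se2 = 3) # unequal SEs change the implied level
```

So with equal standard errors, non-overlap of 1.5*SE bars is close to, but
still slightly stricter than, a 5% test.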

What's your opinion on this? I really appreciate all your help with this,
especially given that this is a more general statistics question.

Best,
--Egor


On 10 December 2013 21:42, Steven J. Pierce <pierces1 at msu.edu> wrote:

> Egor,
>
> These papers may give you a start on finding relevant work about how
> to interpret CI overlap.
>
> Cumming, G. (2009). Inference by eye: Reading the overlap of
> independent confidence intervals. Statistics in Medicine, 28(2),
> 205-220. doi: 10.1002/sim.3471
>
> Cumming, G. (2007). Inference by eye: Pictures of confidence intervals
> and thinking about levels of confidence. Teaching Statistics, 29, 89-93.
>
> Cumming, G., & Finch, S. (2005). Inference by eye: Confidence
> intervals and how to read pictures of data. American Psychologist,
> 60(2), 170-180. doi: 10.1037/0003-066X.60.2.170
>
>
> Steven J. Pierce, Ph.D.
> Associate Director
> Center for Statistical Training & Consulting (CSTAT) Michigan State
> University
> E-mail: pierces1 at msu.edu
> Web: http://www.cstat.msu.edu
>
> -----Original Message-----
> From: Egor Ananyev [mailto:egor.ananyev at gmail.com]
> Sent: Monday, December 09, 2013 8:32 PM
> To: Ben Bolker
> Cc: r-sig-mixed-models
> Subject: Re: [R-sig-ME] Probability CIs for Mixed Logistic Regression
>
> Hi Ben,
>
> Thanks a lot! I've since tried to do the same with predict(), but, as
> you noted, it doesn't provide a way to estimate the standard
> error (< https://github.com/lme4/lme4/issues/147>). So I also tried to
> do this with a bootstrap method from the ez package:
>
> # with ez package (bootstrap):
> library(ez)
> preds = ezPredict(m, boot = TRUE)
> ezPlot2(preds, x=vision)
>
> But the intervals still overlap: <
> https://dl.dropboxusercontent.com/u/9147994/ezPlot2.pdf>. I'll try to
> run the search and see what I can find about conservative CIs...
>
> Thanks again,
> --Egor
>
>
> On 10 December 2013 05:51, Ben Bolker <bbolker at gmail.com> wrote:
>
> > On 13-12-07 10:50 PM, Egor Ananyev wrote:
> > > Hello everyone,
> > >
> > > I have a question on how to calculate confidence intervals for
> > > predicted proportions from a mixed effects logistic model with glmer.
> > > It's probably very basic, but I'm at my wit's end. Below is a
> > > simplified reproducible example. I have a single three-level
> > > categorical independent variable. The method that I use (based on
> > > 1.96*SE) doesn't seem to work, because the confidence intervals for
> > > significantly different effects overlap.
> > >
> > > Here's the output of the code below:
> > >              Est.   SE     z      P       BSum   ME     BlSum  BuSum  PSum   PlSum  PuSum
> > > (Intercept)  2.21   0.60   3.69  <0.001   2.21   1.17   1.04   3.38   0.90   0.74   0.97
> > > vision1     -2.41   0.91  -2.66   0.008  -0.20   1.78  -1.98   1.58   0.45   0.12   0.83
> > > vision2     -1.11   0.97  -1.14   0.253   1.10   1.91  -0.81   3.00   0.75   0.31   0.95
> > >
> > > As you can see, (Intercept) and vision1 CIs overlap -- for both
> > > cumulative (sum) B and the resulting proportions. I killed a few
> > > weekends to try to solve this problem and couldn't. Your help would be
> > > greatly appreciated.
> > >
> > > Thanks,
> > > --Egor
> > >
> >
> > # preparing the data set:
> > ## file URL: https://dl.dropboxusercontent.com/u/9147994/ds_seen.csv
> > ## inputDir = 'C:/Dropbox/Computer/Eclipse/R/HT/_input/'
> > library(RCurl)
> > txt <-
> > getURL("https://dl.dropboxusercontent.com/u/9147994/ds_seen.csv")
> > ds = read.csv(textConnection(txt),header=TRUE)
> > ds$subject = as.factor(ds$subject)
> > ds$vision = as.factor(ds$vision)
> >
> > # running the model:
> > library(lme4)
> > m = glmer(seen ~ vision + (1|subject), data = ds, family = 'binomial')
> > ## better to use accessor methods if possible
> > msum = as.data.frame(coef(summary(m)))
> >
> > contrMat <- matrix(c(1,0,0,1,1,0,1,0,1),byrow=TRUE,ncol=3)
> > msums <- contrMat %*% fixef(m)
> > # calculating confidence intervals for the estimates:
> > msum$BSum[1] = msum$Estimate[1]
> > msum$BSum[2:3] = msum$Estimate[1] + msum$Estimate[2:3]
> > all.equal(msum$BSum,c(msums))  ## TRUE
> >
> > ## BMB:  I don't know why you expect these calculations to work;
> > ## the correct calculation is on the variance-covariance matrix
> > msum$ME = msum$`Std. Error` * 1.96 # margin of error
> > msum$BlSum = msum$BSum - msum$ME # lower bound on B sum
> > msum$BuSum = msum$BSum + msum$ME # upper bound on B sum
> > msum$PSum = plogis(msum$BSum) # predicted probability
> > msum$PlSum = plogis(msum$BlSum) # lower bound on predicted probability
> > msum$PuSum = plogis(msum$BuSum) # upper bound on predicted probability
> >
> > mvcov <- contrMat %*% vcov(m) %*% t(contrMat)
> > mstderr <- sqrt(diag(mvcov))
> > mlwr <- msums - 1.96*mstderr
> > mupr <- msums + 1.96*mstderr
> > mlwrpred <- plogis(mlwr)
> > muprpred <- plogis(mupr)
> >
> > The easier way to do this (which isn't as generalizable to arbitrary
> > contrasts, but works if all you want to do is predict values for
> > each level):
> >
> > m2  <- update(m, . ~ . - 1)  ## take out the intercept
> > all.equal(unname(fixef(m2)),c(msums), tol=1e-5)  ## TRUE
> > (cc <- confint(m2,method="Wald"))
> > all.equal(cbind(mlwr,mupr),unname(cc), tol=2e-5)  ## TRUE
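
The same contrast-matrix calculation can be tried without the original data
file. The sketch below swaps in an ordinary glm() on made-up grouped counts
(no random effect, so it is not the glmer model from this thread); it only
illustrates why the standard error of a sum of coefficients has to come from
the variance-covariance matrix rather than from the coefficient table.

```r
# Hypothetical grouped data: three vision levels, counts of seen / not seen.
d <- data.frame(vision = factor(c("a", "b", "c")),
                seen   = c(32, 18, 28),
                unseen = c(8, 22, 12))
m <- glm(cbind(seen, unseen) ~ vision, family = binomial, data = d)

# Rows: level a = intercept, level b = intercept + b, level c = intercept + c.
contrMat <- matrix(c(1,0,0, 1,1,0, 1,0,1), byrow = TRUE, ncol = 3)
est      <- unname(drop(contrMat %*% coef(m)))                  # logit per level
se_right <- sqrt(diag(contrMat %*% vcov(m) %*% t(contrMat)))    # SE per level
se_naive <- unname(coef(summary(m))[, "Std. Error"])            # SE of differences

# Cross-check against the no-intercept parameterisation.
m2       <- update(m, . ~ . - 1)
se_check <- unname(coef(summary(m2))[, "Std. Error"])

# Wald intervals on the probability scale.
ci <- plogis(cbind(lower = est - 1.96 * se_right,
                   upper = est + 1.96 * se_right))
```

Here se_naive[2] is the standard error of the difference between levels b
and a, which is larger than se_right[2], the standard error of the
prediction for level b itself.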
> >
> > The other thing to notice is that all of these confidence intervals
> > still **do** overlap.  Treating non-overlap of 95% confidence intervals
> > as the criterion for a significant difference at the 5% level is
> > conservative: try searching for "95% confidence intervals overlap
> > conservative" on Google Scholar ...
> >
> > For what it's worth most of this question isn't GLMM-specific or
> > even GLM-specific ...
> >
>