[R-sig-eco] Confidence intervals for predicted probabilities in logistic regression

Tue Jun 23 22:39:37 CEST 2009

Hi Manuel,

Providing self contained code would make the question easier to
answer, but I'll have a go.  Comments below

On Tue, Jun 23, 2009 at 5:03 AM, Manuel Spínola<mspinola10 at gmail.com> wrote:
> Dear list members,
>
> I am fitting a logistic regression and I want to calculate the predicted
> probabilities of a response ("positive" and "negative") for a model with one
> categorical predictor (2 levels, "yes" and "no").
>
> oso.3 = glm(response ~ school, data=osologistic,
> family=binomial(link="logit"))
>
> For the reference level (the intercept, level "no") the point estimate is:,
>
>> plogis(coef(oso.3)[1] + coef(oso.3)[2]*0)
> (Intercept)
>     0.13
>
> To calculate the confidence interval (lower and upper limit) I am using the
> formula: beta0+beta1*x-1.96*se(beta0+beta1*x) and
> beta0+beta1*x+1.96*se(beta0+beta1*x), using the variance and covariance to
> calculate the standard error:
>

This describes an interval for a prediction on the logit scale

>> errore.no = sqrt(1.14 + 1.73*02 + 2*-1.14*0)
>

Why are you multiplying the 2nd term in the sum above by 2?  I'm
guessing that was not intended.

>> plogis(coef(oso.3)[1] + coef(oso.3)[2]*0-1.96*errore.no)
> (Intercept)
>    0.017
>
>> plogis(coef(oso.3)[1] + coef(oso.3)[2]*0+1.96*errore.no)
> (Intercept)
>     0.54
>

Here you seem to want an interval on the response scale.

> However, when I use the confint function I got:
>
>> plogis(confint(oso.3))
> Waiting for profiling to be done...
>           2.5 % 97.5 %
> (Intercept) 0.0076   0.45
> schoolyes   0.8025   1.00
>

These are profiled confidence intervals for the parameters on the
linear (logit) scale.

Here's a method of calculating asymptotic intervals for the
predictions on the response or logit scales:

set.seed(314)
n <- 100
x <- sample(0:1, n, repl=TRUE, prob=1:2)
linpred <- -1 + 4*x
response <- rbinom(n, 1, plogis(linpred))
school <- as.factor(c('yes','no')[x+1])
oso.3 <- glm(response ~ school, family='binomial')
summary(oso.3)

pred <- predict(oso.3, data.frame(school=c('yes','no')),
  se.fit=TRUE) # type = 'link' by default

with(pred, cbind(fit,
  low = fit - 1.96*se.fit,
  up = fit + 1.96*se.fit)
)

pred <- predict(oso.3, data.frame(school=c('yes','no')),
  type='response', se.fit=TRUE)

with(pred, cbind(fit,
  low = fit - 1.96*se.fit,
  up = fit + 1.96*se.fit)
)

hth,

Kingsford Jones

> Here, I am looking at only to the intercept that is the reference level,
> "no").
> I was expecting to get the same lower and upper limit for the intercept.
> Do you know why the results are different?
> Do I need the covariance to calculate the se?
> Thank you very much in advance.
> Best,
>
> Manuel
>
>
> --
> Manuel Spínola, Ph.D.
> Instituto Internacional en Conservación y Manejo de Vida Silvestre
> Universidad Nacional
> Apartado 1350-3000
> Heredia
> COSTA RICA
> mspinola at una.ac.cr
> mspinola10 at gamil.com
> Teléfono: (506) 2277-3598
> Fax: (506) 2237-7036
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>