[R] Confidence Intervals for logistic regression

Sat Aug 7 11:56:32 CEST 2010

>>>>> "PD" == Peter Dalgaard <pdalgd at gmail.com>
>>>>>     on Sat, 07 Aug 2010 10:37:49 +0200 writes:

    PD> Michael Bedward wrote:
    >>> I was aware of this option.  I was assuming it was not ok to do fit +/- 1.96
    >>> se when you requested probabilities.  If this is legitimate then all the
    >>> better.
    >> 
    >> I don't think it is.  I understood that you should do the calculation
    >> in the scale of the linear predictor and then transform to
    >> probabilities.
    >> 
    >> Happy to be corrected if that's wrong.

    PD> Probably, neither is optimal, although any transformed scale is
    PD> asymptotically equivalent. E.g., neither the probability scale nor the
    PD> logit scale stabilizes the variance of a simple proportion (the arcsine
    PD> transform does), so test-based CIs should really be asymmetric in both
    PD> cases rather than just +/- 1.96se.

    PD> However, working on the linear predictor scale has the advantage that
    PD> CIs by definition will not cross the boundaries of the parameter space.
    PD> (For the "usual" link functions: logit, probit, cloglog, that is; it's
    PD> not true for the identity link, obviously.)
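
A minimal sketch of the "linear predictor scale, then transform" approach
discussed above. The data are simulated and the names (x, y, fit, newd, ci)
are purely illustrative assumptions; this is the plain Wald interval on the
link scale, not a profile-likelihood or test-based interval.

```r
# Simulated example data (illustrative only)
set.seed(1)
x <- rnorm(100)
y <- rbinom(100, size = 1, prob = plogis(-0.5 + 1.2 * x))
fit <- glm(y ~ x, family = binomial)

newd <- data.frame(x = c(-2, 0, 2))

# Predict on the linear predictor (logit) scale, with standard errors
pr <- predict(fit, newdata = newd, type = "link", se.fit = TRUE)
z  <- qnorm(0.975)

# Inverse link taken from the fitted family (plogis for the logit link)
linkinv <- family(fit)$linkinv

# Build the interval on the link scale, then map it to probabilities
ci <- data.frame(
  fit   = linkinv(pr$fit),
  lower = linkinv(pr$fit - z * pr$se.fit),
  upper = linkinv(pr$fit + z * pr$se.fit)
)
print(ci)
```

Because the inverse link maps the whole real line into (0, 1), the
back-transformed limits cannot leave the parameter space, and they are in
general not symmetric around the fitted probability.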

I'm coming late to the thread, 
but it seems that nobody has yet given the advice which I would
very *strongly* suggest to anyone asking for confidence
intervals in GLMs:

Use  confint()
which (for "glm"s) will load the MASS package and use likelihood
profiling, giving (much) more reliable confidence intervals,
notably in the case of the infamous Hauck-Donner phenomenon,
which has been mentioned many times "here", and also in
MASS (the book!), I think even in its first edition.
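
For illustration, a small sketch comparing the profile-likelihood intervals
from confint() with the Wald intervals from confint.default(). The data are
simulated and the object names are assumptions made for this example only;
depending on the R version, the profiling method for "glm" objects is
provided by MASS or by stats.

```r
# Simulated example data (illustrative only)
set.seed(1)
x <- rnorm(100)
y <- rbinom(100, size = 1, prob = plogis(-0.5 + 1.2 * x))
fit <- glm(y ~ x, family = binomial)

# Profile-likelihood intervals for the coefficients (logit scale)
ci_prof <- confint(fit, level = 0.95)

# Wald intervals (estimate +/- z * se) for comparison
ci_wald <- confint.default(fit, level = 0.95)

print(ci_prof)
print(ci_wald)
```

Both sets of limits are on the scale of the coefficients; the profile
intervals need not be symmetric around the estimates, whereas the Wald
intervals always are.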

Even more reliable (probably) would be to use the (recommended)
'boot' package and use bootstrap confidence intervals, i.e.,
boot.ci() there.
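
A rough sketch of what that could look like, assuming a case-resampling
(nonparametric) bootstrap of the slope coefficient. The data frame dat, the
helper slope() and the choice of R = 999 replicates are illustrative
assumptions, not a recommendation for any particular analysis.

```r
library(boot)

# Simulated example data (illustrative only)
set.seed(1)
dat <- data.frame(x = rnorm(100))
dat$y <- rbinom(100, size = 1, prob = plogis(-0.5 + 1.2 * dat$x))

# Statistic for boot(): refit the model on the resampled rows
# and return the slope coefficient (hypothetical helper)
slope <- function(data, idx) {
  coef(glm(y ~ x, family = binomial, data = data[idx, ]))[["x"]]
}

b <- boot(dat, statistic = slope, R = 999)

# Percentile bootstrap interval; other types (e.g. "bca") can be
# requested through the 'type' argument
bci <- boot.ci(b, conf = 0.95, type = "perc")
print(bci)
```

The same pattern applies to other quantities of interest, for example a
fitted probability at a given x, by changing what the statistic function
returns.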

Martin Maechler, ETH Zurich