[R] Confidence Intervals for logistic regression

Martin Maechler maechler at stat.math.ethz.ch
Sat Aug 7 11:56:32 CEST 2010


>>>>> "PD" == Peter Dalgaard <pdalgd at gmail.com>
>>>>>     on Sat, 07 Aug 2010 10:37:49 +0200 writes:

    PD> Michael Bedward wrote:
    >>> I was aware of this option.  I was assuming it was not ok to do fit +/- 1.96
    >>> se when you requested probabilities.  If this is legitimate then all the
    >>> better.
    >> 
    >> I don't think it is.  I understood that you should do the calculation
    >> in the scale of the linear predictor and then transform to
    >> probabilities.
    >> 
    >> Happy to be corrected if that's wrong.

    PD> Probably, neither is optimal, although any transformed scale is
    PD> asymptotically equivalent. E.g., neither the probability scale nor the
    PD> logit scale stabilizes the variance of a simple proportion (the arcsine
    PD> transform does), so test-based CIs should really be asymmetric in both
    PD> cases rather than just +/- 1.96se.

    PD> However, working on the linear predictor scale has the advantage that
    PD> CIs by definition will not cross the boundaries of the parameter space.
    PD> (For the "usual" link functions: logit, probit, cloglog, that is; it's
    PD> not true for the identity link, obviously.)
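
For reference, the linear-predictor approach discussed above can be
sketched as follows. This is only a minimal illustration on simulated
data: the variable names, sample size and "true" coefficients are my
own assumptions, not anything from the original question.

```r
# Simulated logistic-regression data (illustrative assumptions only).
set.seed(1)
x <- rnorm(100)
y <- rbinom(100, size = 1, prob = plogis(-0.5 + 1.2 * x))
fit <- glm(y ~ x, family = binomial)

newd <- data.frame(x = c(-2, 0, 2))

# Predict on the linear predictor (logit) scale, with standard errors.
pr <- predict(fit, newdata = newd, type = "link", se.fit = TRUE)
z  <- qnorm(0.975)

# Build the +/- z * se interval on the logit scale, then map the
# endpoints to probabilities with the inverse link (plogis for logit).
ci <- data.frame(
  fit   = plogis(pr$fit),
  lower = plogis(pr$fit - z * pr$se.fit),
  upper = plogis(pr$fit + z * pr$se.fit)
)
ci
```

Because the endpoints are transformed by the inverse link, they cannot
fall outside (0, 1), which is the advantage PD mentions.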

I'm coming late to the thread,
but it seems that nobody has yet given the advice that I would
very *strongly* give to anyone asking for confidence
intervals in GLMs:

Use  confint()
which (for "glm"s) will load the MASS package and use likelihood
profiling, giving (much) more reliable confidence intervals than
the Wald-type "+/- 1.96 se" ones, notably in the case of the infamous
Hauck-Donner phenomenon, which has been mentioned many times "here",
and also in MASS (the book!), I think even in its first edition.
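
A minimal sketch of what this looks like, on simulated data (the data
and the object names are illustrative assumptions only).
confint.default() is shown purely for comparison: it gives the
Wald-type intervals based on the asymptotic standard errors.

```r
# Simulated logistic-regression data (illustrative assumptions only).
set.seed(1)
x <- rnorm(100)
y <- rbinom(100, size = 1, prob = plogis(-0.5 + 1.2 * x))
fit <- glm(y ~ x, family = binomial)

# Profile-likelihood intervals: confint() dispatches to the "glm" method.
ci_profile <- confint(fit, level = 0.95)

# Wald intervals (estimate +/- 1.96 se), for comparison only.
ci_wald <- confint.default(fit, level = 0.95)

ci_profile
ci_wald
```

Both intervals are on the coefficient (logit) scale; exp() of the
endpoints gives intervals for the odds ratios.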

Even more reliable (probably) would be bootstrap confidence
intervals, i.e., boot.ci() from the (recommended) 'boot' package.
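
A hedged sketch of the bootstrap variant, again on simulated data. The
choices here (resampling whole cases, R = 999 replicates, percentile
intervals, and the helper name slope_fun) are my assumptions for the
sake of illustration; other resampling schemes and interval types may
well be preferable for a given problem.

```r
library(boot)

# Simulated logistic-regression data (illustrative assumptions only).
set.seed(1)
d <- data.frame(x = rnorm(100))
d$y <- rbinom(100, size = 1, prob = plogis(-0.5 + 1.2 * d$x))

# Statistic for boot(): refit the model on the resampled rows and
# return the slope coefficient. 'slope_fun' is a hypothetical helper.
slope_fun <- function(data, idx) {
  coef(glm(y ~ x, family = binomial, data = data[idx, ]))[["x"]]
}

b   <- boot(d, statistic = slope_fun, R = 999)
bci <- boot.ci(b, conf = 0.95, type = "perc")

# Columns 4 and 5 of the 'percent' component hold the interval endpoints.
bci$percent[4:5]
```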

Martin Maechler, ETH Zurich


