[R] Confidence Intervals for logistic regression
Martin Maechler
maechler at stat.math.ethz.ch
Sat Aug 7 11:56:32 CEST 2010
>>>>> "PD" == Peter Dalgaard <pdalgd at gmail.com>
>>>>> on Sat, 07 Aug 2010 10:37:49 +0200 writes:
PD> Michael Bedward wrote:
>>> I was aware of this option. I was assuming it was not ok to do fit +/- 1.96
>>> se when you requested probabilities. If this is legitimate then all the
>>> better.
>>
>> I don't think it is. I understood that you should do the calculation
>> in the scale of the linear predictor and then transform to
>> probabilities.
>>
>> Happy to be corrected if that's wrong.
PD> Probably, neither is optimal, although any transformed scale is
PD> asymptotically equivalent. E.g., neither the probability scale nor the
PD> logit scale stabilizes the variance of a simple proportion (the arcsine
PD> transform does), so test-based CIs should really be asymmetric in both
PD> cases rather than just +/- 1.96se.
PD> However, working on the linear predictor scale has the advantage that
PD> CIs by definition will not cross the boundaries of the parameter space.
PD> (For the "usual" link functions: logit, probit, cloglog, that is; it's
PD> not true for the identity link, obviously.)
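For concreteness, the two Wald-type calculations discussed above could be sketched roughly as follows. The data are simulated and every object name (d, fit, nd, ci.resp, ci.link) is made up for illustration; this is only one way of doing it, not code from anyone in the thread.

```r
## Simulated data; all names below are illustrative assumptions.
set.seed(1)
d <- data.frame(x = rnorm(100))
d$y <- rbinom(100, size = 1, prob = plogis(0.5 + d$x))
fit <- glm(y ~ x, family = binomial, data = d)
nd <- data.frame(x = c(-2, 0, 2))   # points at which to predict

## (a) fit +/- 1.96 se directly on the probability scale;
##     nothing keeps these limits inside [0, 1]
pr <- predict(fit, newdata = nd, type = "response", se.fit = TRUE)
ci.resp <- cbind(lower = pr$fit - 1.96 * pr$se.fit,
                 upper = pr$fit + 1.96 * pr$se.fit)

## (b) fit +/- 1.96 se on the linear predictor scale,
##     then back-transform with the inverse link
lp <- predict(fit, newdata = nd, type = "link", se.fit = TRUE)
linkinv <- family(fit)$linkinv
ci.link <- cbind(lower = linkinv(lp$fit - 1.96 * lp$se.fit),
                 upper = linkinv(lp$fit + 1.96 * lp$se.fit))
```

With the logit link, the limits in ci.link necessarily lie in [0, 1] and are asymmetric around the fitted probability, whereas those in ci.resp are symmetric and may fall outside the unit interval.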
I'm coming late to the thread,
but it seems that nobody has yet given the advice which I would
very *strongly* suggest to anyone asking for confidence
intervals in GLMs:
Use confint()
which (for "glm"s) will load the MASS package and use likelihood
profiling, giving (much) more reliable confidence intervals,
notably in the case of the infamous Hauck-Donner phenomenon,
which has been mentioned many times "here", and notably in
MASS (the book!), I think even in its first edition.
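A minimal sketch of that advice, again on simulated data with made-up object names (d, fit, ci.prof, ci.wald). Note that confint() gives intervals for the coefficients, not for fitted probabilities; confint.default() is shown only as the Wald-type (estimate +/- 1.96 se) comparison.

```r
## Simulated data; all names below are illustrative assumptions.
set.seed(2)
d <- data.frame(x = rnorm(100))
d$y <- rbinom(100, size = 1, prob = plogis(0.5 + d$x))
fit <- glm(y ~ x, family = binomial, data = d)

## Profile likelihood intervals for the coefficients
ci.prof <- confint(fit)

## Wald intervals (estimate +/- qnorm(0.975) * se), for comparison
ci.wald <- confint.default(fit)

ci.prof
ci.wald
```

The profile intervals are in general not symmetric around the estimates, which is where they differ from the Wald ones.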
Even more reliable (probably) would be to use the (recommended)
'boot' package and use bootstrap confidence intervals, i.e.,
boot.ci() there.
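A rough sketch of the bootstrap route, under assumptions of my own choosing: case resampling (whole rows are resampled), the slope as the statistic, and percentile intervals. The names d, slope.fun, b and ci.boot are hypothetical, and other resampling schemes or interval types may well be preferable in a given problem.

```r
library(boot)

## Simulated data; all names below are illustrative assumptions.
set.seed(3)
d <- data.frame(x = rnorm(100))
d$y <- rbinom(100, size = 1, prob = plogis(0.5 + d$x))

## Statistic for boot(): refit the model on the resampled rows
## and return the slope coefficient
slope.fun <- function(data, idx) {
  coef(glm(y ~ x, family = binomial, data = data[idx, ]))[["x"]]
}

b <- boot(d, statistic = slope.fun, R = 999)

## Percentile bootstrap interval for the slope
ci.boot <- boot.ci(b, conf = 0.95, type = "perc")
ci.boot
```

The interval limits are the last two entries of ci.boot$percent.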
Martin Maechler, ETH Zurich