[R] Logistic regression - confidence intervals

Frank E Harrell Jr f.harrell at vanderbilt.edu
Wed Feb 8 18:20:26 CET 2006

Cox, Stephen wrote:
> Please forgive a rather naïve question...
>
> Could someone please give a quick explanation for the differences in confidence intervals achieved via confint.glm (based on profile likelihoods) and the intervals achieved using the Design library.
>
> For example, the intervals in the following two outputs are different.
>
> library(Design)
> x = rnorm(100)
> y = gl(2,50)
> d = data.frame(x = x, y = y)
> m1 = lrm(y~x, data = d)
> summary(m1)
>
> m2 = glm(y~x, family = binomial, data = d)
> confint(m2)
>
> I have spent time trying to figure this out via archives, but have not had much luck.
>
> Regards
>
> Stephen

Design uses Wald-based confidence intervals, which rely on large-sample
normality of the parameter estimates.  These are good for most situations,
but profile likelihood confidence intervals are preferred.  Someday I'll
make Design do those.

One advantage to Wald statistics is that they extend readily to cluster
sampling (e.g., using cluster sandwich covariance estimators) and other
complications (e.g., adjustment of variances for multiple imputation),
whereas likelihood ratio statistics do not (unless, e.g., you have an
explicit model for the correlation structure or other facets of the model).
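
As a rough illustration of why Wald intervals extend easily to clustering,
the sketch below hand-computes a cluster sandwich covariance for a logistic
fit in base R and plugs it into the usual estimate +/- z * SE formula.  The
cluster variable id (10 clusters of 10 observations) is hypothetical and not
part of the original example, no small-sample correction is applied, and
this is not how Design implements it.

```r
set.seed(1)                                  # arbitrary seed
x  <- rnorm(100)
y  <- gl(2, 50)
id <- rep(1:10, each = 10)                   # hypothetical cluster labels

m2 <- glm(y ~ x, family = binomial)

X <- model.matrix(m2)
r <- residuals(m2, type = "response")        # y - fitted probability

# Per-observation score contributions for the logit link are x_i * (y_i - p_i);
# sum them within each cluster
S <- rowsum(X * r, group = id)

bread <- vcov(m2)                            # model-based covariance
meat  <- crossprod(S)                        # sum over clusters of s_g s_g'
V     <- bread %*% meat %*% bread            # cluster sandwich estimate

# Wald interval using the cluster-adjusted standard errors
z  <- qnorm(0.975)
se <- sqrt(diag(V))
wald_cluster <- cbind(lower = coef(m2) - z * se, upper = coef(m2) + z * se)
print(wald_cluster)
```

Only the covariance matrix changes; the interval formula stays the same.  A
likelihood ratio or profile interval has no equally simple adjustment.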

Also note that confint gives a confidence interval for the coefficient
itself, i.e. the change in log odds for a one-unit change in x, whereas
summary.Design computes an interquartile-range effect (the difference in
x-values is shown in the summary output).
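
The scaling difference can be sketched in base R as follows.  This assumes
the interquartile-range effect is the coefficient multiplied by the distance
between the 25th and 75th percentiles of x, computed here with quantile();
the quartiles used by Design may be computed slightly differently, so treat
the numbers as illustrative.

```r
set.seed(1)                                  # arbitrary seed
x <- rnorm(100)
y <- gl(2, 50)
m2 <- glm(y ~ x, family = binomial)

b  <- unname(coef(m2)["x"])
se <- sqrt(vcov(m2)["x", "x"])
z  <- qnorm(0.975)

# Wald effect and interval for a one-unit change in x (log odds scale)
one_unit <- c(effect = b, lower = b - z * se, upper = b + z * se)

# Same quantities for a change from the 25th to the 75th percentile of x
q     <- quantile(x, c(0.25, 0.75))
delta <- unname(q[2] - q[1])
iqr_effect <- delta * one_unit

print(one_unit)
print(iqr_effect)
```

Because the linear predictor is linear in x, the interquartile-range effect
and its Wald limits are just the one-unit values multiplied by delta.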

When posting a nice simulated example, it's best to call
set.seed(something) first so everyone will get the same data.

Frank

--
Frank E Harrell Jr   Professor and Chair           School of Medicine
Department of Biostatistics   Vanderbilt University