[R] Calculating/understanding variance-covariance matrix of logistic regression (lrm $var)
Frank E Harrell Jr
feh3k at spamcop.net
Thu Jan 29 05:00:41 CET 2004
On Thu, 29 Jan 2004 02:34:27 +0100 (CET)
Karl Knoblick <karlknoblich at yahoo.de> wrote:
> Hello!
>
> I want to understand / recalculate what is done to get
> the CI of the logistic regression evaluated with lrm.
> As far as I have traced it back, my problem is the
> variance-covariance matrix fit$var of the fit
> (fit<-lrm(...), fit$var). Here is what I found and where
> I got stuck:
>
> -----------------
> library(Design)
> # data
> D<-c(rep("a", 20), rep("b", 20))
> V<-0.25*(1:40)
> V[1]<-25
> V[40]<-15
> data<-data.frame(D, V)
> d<-datadist(data)
> options(datadist="d")
>
> # Fit
> fit<-lrm(D ~ V, data=data, x=TRUE, se.fit=TRUE)
> plot(fit, conf.int=0.95) # same as plot(fit)
>
> # calculation of upper and lower CI (pred$lower, pred$upper)
> pred<-predict(fit, data.frame(V=V), conf.int=0.95, se.fit=TRUE)
> points(V, pred$upper, col=2, pch=3) # to check
>
> # looking in function predict, the CIs are calculated with the se
> # using fit$var:
> X<-cbind(rep(1, nrow(fit$x)), fit$x) # fit$x are the V
> cov<-fit$var # <- THIS I DO NOT UNDERSTAND (***), see below
> se <- drop(sqrt(((X %*% cov) * X) %*% rep(1, ncol(X))))
>
> # check if it is the same
> min(se - pred$se.fit) # result: 0
> max(se - pred$se.fit) # result: 0
>
> # looking at the problem:
> cov
> -----------------
> Result:
> Intercept V
> Intercept 0.7759040 -0.12038969
> V -0.1203897 0.02274177
>
>
> (***)
> fit$var is the estimated variance-covariance matrix.
> How is it calculated? (Meaning of intercept and x?)
>
> Does anybody know how to calculate this "by hand", or
> can you give me a reference (preferably on the internet)?
>
> Thanks!
> Karl.
Karl:
I'm not clear why you quoted the other code, as your entire question is
about the basic quantity fit$var. fit$var is the inverse of the observed
information matrix evaluated at the final regression coefficient
estimates. This is a very standard approach and is detailed in most books
on logistic regression or GLMs. It is related to the Newton-Raphson
iterative algorithm for maximizing the likelihood. The information matrix
is like the sums-of-squares and cross-products matrix in ordinary
regression, except that each row carries a weight of the form P*(1-P),
where P is that row's estimated probability of the event from the final
iteration.
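In other words, the information matrix is X'WX, where X is the design
matrix (intercept column plus predictors) and W is diagonal with entries
P*(1-P), and fit$var is its inverse. A minimal "by hand" sketch (not
Frank's code; it uses base R's glm() in place of lrm(), and a numeric
0/1 response in place of "a"/"b", which gives the same fit):

```r
# Toy data mirroring the example in the question
D <- c(rep(0, 20), rep(1, 20))   # numeric 0/1 response instead of "a"/"b"
V <- 0.25 * (1:40)
V[1] <- 25
V[40] <- 15

fit <- glm(D ~ V, family = binomial)  # base-R equivalent of lrm(D ~ V)

X <- model.matrix(fit)                # n x 2 design matrix: intercept and V
p <- fitted(fit)                      # estimated probabilities P per row
W <- diag(p * (1 - p))                # diagonal weights P*(1-P)

info  <- t(X) %*% W %*% X             # information matrix X'WX
V_hat <- solve(info)                  # its inverse: the covariance matrix

# should agree with the fitter's own covariance matrix
all.equal(V_hat, vcov(fit), check.attributes = FALSE)
```

The row and column names ("Intercept", "V") simply index which
coefficient each variance or covariance belongs to, which answers the
"meaning of intercept and x" part of the question.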
---
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University