[R] Calculating/understanding variance-covariance matrix of logistic regression (lrm $var)
Frank E Harrell Jr
feh3k at spamcop.net
Thu Jan 29 05:00:41 CET 2004
On Thu, 29 Jan 2004 02:34:27 +0100 (CET)
Karl Knoblick <karlknoblich at yahoo.de> wrote:
> Hello!
>
> I want to understand / recalculate what is done to get
> the CI of the logistic regression evaluated with lrm.
> As far as I have traced it back, my problem is the
> variance-covariance matrix fit$var of the fit
> (fit<-lrm(...), fit$var). Here is what I found and where
> I got stuck:
>
> -----------------
> library(Design)
> # data
> D<-c(rep("a", 20), rep("b", 20))
> V<-0.25*(1:40)
> V[1]<-25
> V[40]<-15
> data<-data.frame(D, V)
> d<-datadist(data)
> options(datadist="d")
>
> # Fit
> fit<-lrm(D ~ V, data=data, x=TRUE, se.fit=TRUE)
> plot(fit, conf.int=0.95) # same as plot(fit)
>
> # calculation of upper and lower CI (pred$lower, pred$upper)
> pred<-predict(fit, data.frame(V=V), conf.int=0.95, se.fit=TRUE)
> points(V, pred$upper, col=2, pch=3) # to check
>
> # looking in function predict, the CIs are calculated with the se
> # using fit$var:
> X<-cbind(rep(1, nrow(fit$x)), fit$x) # fit$x are the V
> cov<-fit$var # <- THIS I DO NOT UNDERSTAND (***), see below
> se <- drop(sqrt(((X %*% cov) * X) %*% rep(1, ncol(X))))
>
> # check if it is the same
> min(se - pred$se.fit) # result: 0
> max(se - pred$se.fit) # result: 0
>
> # looking at the problem:
> cov
> -----------------
> Result:
> Intercept V
> Intercept 0.7759040 -0.12038969
> V -0.1203897 0.02274177
>
>
> (***)
> fit$var is the estimated variance-covariance matrix.
> How is it calculated? (Meaning of intercept and x?)
>
> Does anybody know how to calculate this "by hand", or
> can you give me a reference (preferably on the internet)?
>
> Thanks!
> Karl.
Karl:
I'm not clear why you quoted the other code, as your entire question is
about the basic quantity fit$var. fit$var is the inverse of the observed
information matrix evaluated at the final regression coefficient
estimates. This is a very standard approach and is detailed in most books
on logistic regression or GLMs. It is related to the Newton-Raphson
iterative algorithm for maximizing the likelihood. The information matrix
is like the sums-of-squares and cross-products matrix in ordinary
regression, except that each row carries a weight of the form P*(1-P),
where P is that row's estimated probability of the event from the final
iteration.
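In other words, the information matrix is X'WX, where X is the design
matrix (intercept column plus predictors) and W is diagonal with entries
P*(1-P), and fit$var is its inverse. A minimal "by hand" sketch (not
Frank's code; it uses base R's glm() in place of lrm(), and a numeric
0/1 response in place of "a"/"b", which gives the same fit):

```r
# Toy data mirroring the example in the question
D <- c(rep(0, 20), rep(1, 20))   # numeric 0/1 response instead of "a"/"b"
V <- 0.25 * (1:40)
V[1] <- 25
V[40] <- 15

fit <- glm(D ~ V, family = binomial)  # base-R equivalent of lrm(D ~ V)

X <- model.matrix(fit)                # n x 2 design matrix: intercept and V
p <- fitted(fit)                      # estimated probabilities P per row
W <- diag(p * (1 - p))                # diagonal weights P*(1-P)

info  <- t(X) %*% W %*% X             # information matrix X'WX
V_hat <- solve(info)                  # its inverse: the covariance matrix

# should agree with the fitter's own covariance matrix
all.equal(V_hat, vcov(fit), check.attributes = FALSE)
```

The row and column names ("Intercept", "V") simply index which
coefficient each variance or covariance belongs to, which answers the
"meaning of intercept and x" part of the question.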
---
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University