[R] weighted Cox proportional hazards regression
Scott Bartell
sbartell at uci.edu
Tue Dec 4 14:00:54 CET 2007
I'm getting unexpected results from the coxph function when using
weights from counter-matching. For example, the following code
produces a parameter estimate of -1.59 where I expect 0.63:
d2 = structure(list(x = c(1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
1, 0, 0, 1, 0, 1, 0, 1, 0, 1), wt = c(5, 42, 40, 4, 43, 4, 42,
4, 44, 5, 38, 4, 39, 4, 4, 37, 40, 4, 44, 5, 45, 5, 44, 5), riskset = c(1L,
1L, 4L, 4L, 6L, 6L, 12L, 12L, 13L, 13L, 19L, 19L, 23L, 23L, 31L,
31L, 42L, 42L, 45L, 45L, 70L, 70L, 93L, 93L), cc = c(1, 0, 1,
0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0
), pseudotime = rep(1,24)), .Names = c("x", "wt", "riskset",
"cc", "pseudotime"), class = "data.frame", row.names=1:24)
coxph( Surv(pseudotime, cc) ~ x + strata(riskset), weights=wt,
robust=T, method="breslow",data=d2)
I'm expecting a value of about 0.63 to 0.64 based on the data source
(simulated) and the following hand-coded MLE:
negloglik = function(beta,dat) {
dat$wexb = dat$wt * exp(dat$x * beta)
agged = aggregate(dat$wexb,list(riskset=dat$riskset),sum)
names(agged)[2] = "denom"
dat = merge(dat[dat$cc==1,],agged,by="riskset")
-sum(log(dat$wexb)-log(dat$denom))
}
nlm(negloglik,0,hessian=T,dat=d2)
Am I misunderstanding the meaning of case weights in the coxph
function? The help file is pretty terse.
Scott Bartell, PhD
Assistant Professor
Department of Epidemiology
University of California, Irvine
More information about the R-help
mailing list