[R] weighted Cox proportional hazards regression
    Scott Bartell 
    sbartell at uci.edu
       
    Tue Dec  4 14:00:54 CET 2007
    
    
  
I'm getting unexpected results from the coxph function when using
weights from counter-matching.  For example, the following code
produces a parameter estimate of -1.59 where I expect 0.63:
d2 = structure(list(x = c(1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
1, 0, 0, 1, 0, 1, 0, 1, 0, 1), wt = c(5, 42, 40, 4, 43, 4, 42,
4, 44, 5, 38, 4, 39, 4, 4, 37, 40, 4, 44, 5, 45, 5, 44, 5), riskset = c(1L,
1L, 4L, 4L, 6L, 6L, 12L, 12L, 13L, 13L, 19L, 19L, 23L, 23L, 31L,
31L, 42L, 42L, 45L, 45L, 70L, 70L, 93L, 93L), cc = c(1, 0, 1,
0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0
), pseudotime = rep(1,24)), .Names = c("x", "wt", "riskset",
"cc", "pseudotime"), class = "data.frame", row.names=1:24)
coxph( Surv(pseudotime, cc) ~ x + strata(riskset), weights=wt,
robust=T, method="breslow",data=d2)
I'm expecting a value of about 0.63 to 0.64 based on the data source
(simulated) and the following hand-coded MLE:
negloglik = function(beta,dat) {
  dat$wexb = dat$wt * exp(dat$x * beta)
  agged = aggregate(dat$wexb,list(riskset=dat$riskset),sum)
  names(agged)[2] = "denom"
  dat = merge(dat[dat$cc==1,],agged,by="riskset")
  -sum(log(dat$wexb)-log(dat$denom))
  }
nlm(negloglik,0,hessian=T,dat=d2)
Am I misunderstanding the meaning of case weights in the coxph
function?  The help file is pretty terse.
Scott Bartell, PhD
Assistant Professor
Department of Epidemiology
University of California, Irvine
    
    
More information about the R-help
mailing list