[R] cch function and time dependent covariates

Thomas Lumley tlumley at u.washington.edu
Sun Jun 15 17:40:51 CEST 2008


cch() does not allow for multiple records per person (as it says), and so doesn't allow for time-dependent covariates.
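
As a quick illustration of what triggers that restriction, here is a minimal sketch in base R using the toy (start, end) layout quoted below. The data frame name `long` and the zero-filled event column are assumptions made for the example; this is not the check cch() itself runs.

```r
# Toy long-format data mirroring the quoted layout: one row per interval.
long <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2, 2),
  start = c(3, 4, 5, 2, 3, 4, 5),
  end   = c(4, 5, 6, 3, 4, 5, 6),
  event = c(0, 0, 1, 0, 0, 0, 1)   # blanks in the quoted table taken as 0 (assumption)
)

# Count rows per subject; any count above 1 means several records per id,
# which is the situation cch() refuses to handle.
records.per.id <- table(long$id)
has.multiple <- any(records.per.id > 1)
print(records.per.id)
print(has.multiple)
```

Any data set folded into (start, stop) intervals for time-dependent covariates will have several rows per id, so it cannot be passed to cch().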

       -thomas
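
On the nwtco comparison quoted below, one thing worth checking: the suggested coxph() call defines case = 1 for events from outside the subcohort, whereas the quoted code passes offset(-100*subcohort), which down-weights the subcohort members instead. That may account for the differing estimates. A minimal sketch with the indicator flipped follows; `outside.case` is a hypothetical column name introduced for illustration, and the comparison is something to inspect, not a guaranteed match.

```r
library(survival)

# Case-cohort sample: all relapses plus the random subcohort (as in the quoted code).
ccoh.data <- nwtco[with(nwtco, rel == 1 | in.subcohort), ]
ccoh.data$subcohort <- ccoh.data$in.subcohort
ccoh.data$histol <- factor(ccoh.data$histol, labels = c("FH", "UH"))
ccoh.data$stage  <- factor(ccoh.data$stage, labels = c("I", "II", "III", "IV"))
ccoh.data$age    <- ccoh.data$age / 12   # age in years

# Hypothetical helper column: 1 for a case sampled from OUTSIDE the subcohort,
# 0 for every subcohort member (this plays the role of "case" in the quoted suggestion).
ccoh.data$outside.case <- as.numeric(!ccoh.data$subcohort)

fit.cox <- coxph(Surv(edrel, rel) ~ stage + histol + age +
                   offset(-100 * outside.case) + cluster(seqno),
                 data = ccoh.data)
fit.cch <- cch(Surv(edrel, rel) ~ stage + histol + age, data = ccoh.data,
               subcoh = ~subcohort, id = ~seqno, cohort.size = 4028,
               method = "SelfPren")

# Side-by-side coefficients; inspect these rather than assuming they agree.
print(cbind(coxph = coef(fit.cox), cch = coef(fit.cch)))
```

Because every row kept in ccoh.data is either a subcohort member or a case, negating the subcohort flag is enough to pick out the cases from outside the subcohort.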

On Thu, 12 Jun 2008, Jin Wang wrote:

> The same subject id has to appear multiple times, in the format shown below,
> but multiple records per id are not allowed in cch(), so it is difficult to
> use cch() for time-dependent covariates. Maybe coxph() is an alternative,
> but that seems difficult because coxph() and cch() return different
> estimates for the same data ("nwtco") even without time-dependent
> covariates.
>
> id start end event
> 1    3    4
> 1    4    5
> 1    5    6    1
> 2    2    3
> 2    3    4
> 2    4    5
> 2    5    6    1
> I use the time-dependent covariate data "Rossi" from
> http://socserv.mcmaster.ca/jfox/Books/Companion/appendix-cox-regression.pdf
> and build a new case-cohort data set with a time-dependent variable from it:
> sc<-sample(c(TRUE,FALSE,FALSE,FALSE,FALSE,FALSE), 432, replace = TRUE)
> str(Rossi)
> Rossi1<-cbind(Rossi,sc)
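> Rossi2<-cbind(seqno=seq_len(nrow(Rossi1)),Rossi1)  # seqno was undefined; assumed to be a row index
> subcoh1 <- Rossi2$sc
> selccoh1 <- with(Rossi2, arrest==1|subcoh1==1)  # event column is 'arrest' before fold()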
> ccoh1.data <- Rossi2[selccoh1,]
> ccoh1.data$subcohort <- subcoh1[selccoh1]
> str(ccoh1.data)
> ccoh1.data.fold <- fold(ccoh1.data, time='week',
> event='arrest', cov=12:63, cov.names='employed')
> str(ccoh1.data.fold)
> ccoh1.data.fold$sc<-as.logical(ccoh1.data.fold$sc)
> ccoh1.data.fold$subcohort<-as.logical(ccoh1.data.fold$subcohort)
> fit1.allison.2 <- cch(Surv(start, stop, arrest.time) ~
> fin + age + race + wexp + mar + paro + prio + employed,
> data=ccoh1.data.fold,subcoh=~subcohort,id=~seqno,cohort.size=19809)
> history(1000)
>
>> fit1.allison.2 <- cch(Surv(start, stop, arrest.time) ~
> +  fin + age + race + wexp + mar + paro + prio + employed,
> +  data=ccoh1.data.fold,subcoh=~subcohort,id=~seqno,cohort.size=19809)
> Error in cch(Surv(start, stop, arrest.time) ~ fin + age + race + wexp +  :
>        Multiple records per id not allowed
> =======================================================================
>
> 2008/6/12 Jin Wang <jinwang25 at gmail.com>:
>
>> I tried your alternative method on the example from the cch() help page.
>> The example data "nwtco" has no time-dependent covariates. I ran cch()
>> and coxph() on the same data, but the estimates are different. I
>> don't know if I did anything wrong.
>>
>> subcoh <- nwtco$in.subcohort
>> selccoh <- with(nwtco, rel==1|subcoh==1)
>> ccoh.data <- nwtco[selccoh,]
>> ccoh.data$subcohort <- subcoh[selccoh]
>> ## central-lab histology
>> ccoh.data$histol <- factor(ccoh.data$histol,labels=c("FH","UH"))
>> ## tumour stage
>> ccoh.data$stage <- factor(ccoh.data$stage,labels=c("I","II","III","IV"))
>> ccoh.data$age <- ccoh.data$age/12 # Age in years
>> fit.ccSP <- cch(Surv(edrel, rel) ~ stage + histol + age, data =ccoh.data,
>> subcoh = ~subcohort, id=~seqno, cohort.size=4028, method="SelfPren")
>> fit2.ccP <- coxph(Surv(edrel, rel) ~ stage + histol + age +
>> offset(-100*subcohort)+cluster(seqno),data =ccoh.data)
>>
>>
>>> fit2.ccP
>> Call:
>> coxph(formula = Surv(edrel, rel) ~ stage + histol + age + offset(-100 *
>>     subcohort) + cluster(seqno), data = ccoh.data)
>>
>>             coef exp(coef) se(coef) robust se      z      p
>> stageII  -0.1245     0.883   0.1236    0.1371 -0.908 0.3600
>> stageIII  0.0193     1.020   0.1252    0.1517  0.127 0.9000
>> stageIV   0.2997     1.350   0.1370    0.1509  1.986 0.0470
>> histolUH  0.3518     1.422   0.0920    0.1092  3.223 0.0013
>> age      -0.0281     0.972   0.0144    0.0168 -1.678 0.0930
>> Likelihood ratio test=34.5  on 5 df, p=1.89e-06  n= 1154
>>
>>
>>> summary(fit.ccSP)
>> Case-cohort analysis,x$method, SelfPrentice
>>  with subcohort of 668 from cohort of 4028
>>
>> Call: cch(formula = Surv(edrel, rel) ~ stage + histol + age,
>>     data = ccoh.data, subcoh = ~subcohort, id = ~seqno,
>>     cohort.size = 4028, method = "SelfPren")
>>
>> Coefficients:
>>           Coef    HR  (95%   CI)     p
>> stageII  0.736 2.088 1.491 2.925 0.000
>> stageIII 0.597 1.818 1.285 2.571 0.001
>> stageIV  1.392 4.021 2.670 6.057 0.000
>> histolUH 1.506 4.507 3.274 6.203 0.000
>> age      0.043 1.044 0.996 1.095 0.069
>>
>>
>> 2008/6/12 Terry Therneau <therneau at mayo.edu>:
>>
>> -----  begin included message
>>> In a case-cohort study, we can fit a proportional hazards regression model
>>> to case-cohort data. In R, the function is cch() in the survival package.
>>> I am now working on a case-cohort analysis with time-dependent covariates
>>> using cch(). Does cch() support this or not?
>>> The cch() manual does not say whether time-dependent covariates are allowed.
>>> I know coxph() in the survival package can handle time-dependent covariates.
>>> ------ end inclusion -----------------------------------------------
>>>
>>>  The cch function was added to the package by Breslow and Lumley, neither
>>> of whom appears to be monitoring the list lately.  Since it claims to
>>> implement the methods in Li and Therneau, and I don't know the cch code,
>>> let me suggest an alternate way to create your fit:
>>>  Assume that your data set has the usual coxph variables, including
>>> time-dependent covariates as multiple observations per subject using
>>> (start, stop) style, along with 2 other variables
>>>        id = a unique identifier per subject
>>>        case = 0 if the subject is a member of the random subcohort
>>>               1 if the subject is a case (an event from outside the
>>>                 subcohort)
>>>
>>> Then
>>>   coxph(Surv(time1, time2, status) ~ x1 + x2+ .... + offset(-100*case) +
>>>             cluster(id), data=mydata)
>>>
>>> will fit the case-cohort model.  This correctly allows for time-dependent
>>> covariates.  It corresponds to the "Self" method of cch.
>>>  Why -100?  It causes the case to have a relative weight of approx 0 in a
>>> particular weighted mean; exp(-100) is small enough and doesn't cause
>>> trouble for the exp function.
>>>
>>>        Terry Therneau
>>>
>>>
>>>
>>
>

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle


