[R] cch function and time dependent covariates
Thomas Lumley
tlumley at u.washington.edu
Sun Jun 15 17:40:51 CEST 2008
cch() does not allow for multiple records per person (as it says), and so doesn't allow for time-dependent covariates.
-thomas
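
A minimal sketch of what "multiple records per person" means here, using base R only. The data frame `long` is hypothetical: it copies the start/end layout quoted below, and the blank event entries there are assumed to mean 0. The sketch only checks the data layout, it does not reproduce cch()'s internal test.

```r
# Hypothetical long-format data: one row per (start, end] interval per subject.
# Blank event cells in the quoted layout are assumed to be 0 (no event).
long <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2, 2),
  start = c(3, 4, 5, 2, 3, 4, 5),
  end   = c(4, 5, 6, 3, 4, 5, 6),
  event = c(0, 0, 1, 0, 0, 0, 1)
)

# Count rows per subject; any count above 1 means repeated ids.
records_per_id <- table(long$id)
has_repeats <- any(duplicated(long$id))

print(records_per_id)
print(has_repeats)
```

Data in this shape is what the (start, stop) form of Surv() in coxph() expects for time-dependent covariates, and it is the shape that triggers the "Multiple records per id not allowed" error in cch().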
On Thu, 12 Jun 2008, Jin Wang wrote:
> With time-dependent covariates, the same subject id has to appear in several
> records, in the format below. But cch() reports "Multiple records per id not
> allowed", so it is difficult to use cch() for time-dependent covariates. Maybe
> coxph() is an alternative, but that seems difficult too, because coxph() and
> cch() return different estimates for the same data "nwtco" even without
> time-dependent covariates.
>
> id start end event
>  1     3   4
>  1     4   5
>  1     5   6     1
>  2     2   3
>  2     3   4
>  2     4   5
>  2     5   6     1
> I use the time-dependent covariate data "Rossi" from
> http://socserv.mcmaster.ca/jfox/Books/Companion/appendix-cox-regression.pdf
> and build a new case-cohort data set with a time-dependent variable from the
> Rossi data:
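> ## seqno is not defined in the post; assumed here to be a row index
> seqno <- seq_len(nrow(Rossi))
> ## sample about 1/6 of the 432 subjects as the subcohort
> sc <- sample(c(TRUE,FALSE,FALSE,FALSE,FALSE,FALSE), 432, replace = TRUE)
> str(Rossi)
> Rossi1 <- cbind(Rossi, sc)
> Rossi2 <- cbind(seqno, Rossi1)
> subcoh1 <- Rossi2$sc
> ## keep every case plus the subcohort; before fold() the event column
> ## is assumed to be arrest (arrest.time only exists after folding)
> selccoh1 <- with(Rossi2, arrest==1 | subcoh1==1)
> ccoh1.data <- Rossi2[selccoh1,]
> ccoh1.data$subcohort <- subcoh1[selccoh1]
> str(ccoh1.data)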
> ccoh1.data.fold <- fold(ccoh1.data, time='week',
> event='arrest', cov=12:63, cov.names='employed')
> str(ccoh1.data.fold)
> ccoh1.data.fold$sc<-as.logical(ccoh1.data.fold$sc)
> ccoh1.data.fold$subcohort<-as.logical(ccoh1.data.fold$subcohort)
> fit1.allison.2 <- cch(Surv(start, stop, arrest.time) ~
> fin + age + race + wexp + mar + paro + prio + employed,
> data=ccoh1.data.fold,subcoh=~subcohort,id=~seqno,cohort.size=19809)
> history(1000)
>
>> fit1.allison.2 <- cch(Surv(start, stop, arrest.time) ~
> + fin + age + race + wexp + mar + paro + prio + employed,
> + data=ccoh1.data.fold,subcoh=~subcohort,id=~seqno,cohort.size=19809)
> Error in cch(Surv(start, stop, arrest.time) ~ fin + age + race + wexp + :
> Multiple records per id not allowed
> =======================================================================
>
> 2008/6/12 Jin Wang <jinwang25 at gmail.com>:
>
>> I tried your alternative method on the example in the cch() help page.
>> The example data "nwtco" has no time-dependent covariates. I ran cch()
>> and coxph() on the same data, but the estimates are different. I
>> don't know if I did anything wrong.
>>
>> subcoh <- nwtco$in.subcohort
>> selccoh <- with(nwtco, rel==1|subcoh==1)
>> ccoh.data <- nwtco[selccoh,]
>> ccoh.data$subcohort <- subcoh[selccoh]
>> ## central-lab histology
>> ccoh.data$histol <- factor(ccoh.data$histol,labels=c("FH","UH"))
>> ## tumour stage
>> ccoh.data$stage <- factor(ccoh.data$stage,labels=c("I","II","III","IV"))
>> ccoh.data$age <- ccoh.data$age/12 # Age in years
>> fit.ccSP <- cch(Surv(edrel, rel) ~ stage + histol + age, data =ccoh.data,
>> subcoh = ~subcohort, id=~seqno, cohort.size=4028, method="SelfPren")
>> fit2.ccP <- coxph(Surv(edrel, rel) ~ stage + histol + age +
>> offset(-100*subcohort)+cluster(seqno),data =ccoh.data)
>>
>>
>>> fit2.ccP
>> Call:
>> coxph(formula = Surv(edrel, rel) ~ stage + histol + age + offset(-100 *
>> subcohort) + cluster(seqno), data = ccoh.data)
>>
>>             coef exp(coef) se(coef) robust se      z      p
>> stageII  -0.1245     0.883   0.1236    0.1371 -0.908 0.3600
>> stageIII  0.0193     1.020   0.1252    0.1517  0.127 0.9000
>> stageIV   0.2997     1.350   0.1370    0.1509  1.986 0.0470
>> histolUH  0.3518     1.422   0.0920    0.1092  3.223 0.0013
>> age      -0.0281     0.972   0.0144    0.0168 -1.678 0.0930
>> Likelihood ratio test=34.5 on 5 df, p=1.89e-06 n= 1154
>>
>>
>>> summary(fit.ccSP)
>> Case-cohort analysis,x$method, SelfPrentice
>> with subcohort of 668 from cohort of 4028
>>
>> Call: cch(formula = Surv(edrel, rel) ~ stage + histol + age, data = ccoh.data,
>>     subcoh = ~subcohort, id = ~seqno, cohort.size = 4028, method = "SelfPren")
>>
>> Coefficients:
>>           Coef    HR  (95%   CI)     p
>> stageII  0.736 2.088 1.491 2.925 0.000
>> stageIII 0.597 1.818 1.285 2.571 0.001
>> stageIV  1.392 4.021 2.670 6.057 0.000
>> histolUH 1.506 4.507 3.274 6.203 0.000
>> age      0.043 1.044 0.996 1.095 0.069
>>
>>
>> 2008/6/12 Terry Therneau <therneau at mayo.edu>:
>>
>> ----- begin included message
>>> In a case-cohort study, we can fit a proportional hazards regression model
>>> to case-cohort data. In R, the function is cch() in the survival package.
>>> I am now working on a case-cohort analysis with time-dependent covariates
>>> using cch(). Does cch() support this or not?
>>> The cch() manual does not say whether time-dependent covariates are allowed.
>>> I know coxph() in the survival package can handle time-dependent covariates.
>>> ------ end inclusion -----------------------------------------------
>>>
>>> The cch function was added to the package by Breslow and Lumley, neither
>>> of whom appears to be monitoring the list lately. Since it claims to
>>> implement the methods in Li and Therneau, and I don't know the cch code,
>>> let me suggest an alternate way to create your fit:
>>> Assume that your data set has the usual coxph variables, including
>>> time-dependent covariates as multiple observations per subject using
>>> (start, stop) style, along with 2 other variables
>>> id = a unique identifier per subject
>>> case = 0 if the subject is a member of the random subcohort
>>>        1 if the subject is a case (an event from outside the subcohort)
>>>
>>> Then
>>> coxph(Surv(time1, time2, status) ~ x1 + x2+ .... + offset(-100*case) +
>>> cluster(id), data=mydata)
>>>
>>> will fit the case-cohort model. This correctly allows for time-dependent
>>> covariates. It corresponds to the "Self" method of cch.
>>> Why -100? It gives the case a relative weight of approximately 0 in a
>>> particular weighted mean; exp(-100) is small enough for that and doesn't
>>> cause trouble for the exp function.
>>>
>>> Terry Therneau
>>>
>>>
>>>
>>
>
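
The coding of `case` and the "Why -100?" remark in the quoted suggestion can be illustrated with a small base-R sketch. All vectors here (`subcohort`, `event`, `x`) are hypothetical and stand for one toy risk set; the weighted mean is only a stand-in for the "particular weighted mean" mentioned above, not the exact quantity computed inside coxph().

```r
# Hypothetical risk set of four subjects; the last one is a case sampled
# from outside the subcohort.
subcohort <- c(TRUE, TRUE, TRUE, FALSE)
event     <- c(0, 1, 0, 1)
x         <- c(0.2, 1.5, 0.7, 3.0)

# Coding from the quoted message: 0 for subcohort members,
# 1 for a case (event) from outside the subcohort.
case <- as.numeric(event == 1 & !subcohort)

# Relative weights implied by offset(-100 * case).
w <- exp(-100 * case)

# Weighted mean of x: the non-subcohort case contributes essentially nothing,
# so the result matches the plain mean over subcohort members.
wmean_offset    <- sum(w * x) / sum(w)
wmean_subcohort <- mean(x[case == 0])

print(case)
print(c(wmean_offset, wmean_subcohort))
```

Note that the coxph() call quoted earlier in the thread uses offset(-100*subcohort), which down-weights the subcohort members rather than the outside cases, the opposite of the coding defined here. Whether that alone accounts for the difference from the cch() estimates is not established in this thread.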
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle