[R] Modelling survival with time-dependent covariates

Terry Therneau therneau at mayo.edu
Fri Jul 2 15:25:21 CEST 2010


 1-Would informing the algorithm coxph which samples represent the same
person (through the use of an Id for example) improve the "efficiency"
of the estimated model? And if so, how should I do that? Using strata()?

 No, it makes no change. The reason is that the (start, stop] notation is
just a trick.  At each death time the program needs to figure out what the
covariates are for everyone else at that time; the start,stop lets it
pick the right line for each subject.  As long as there are no overlaps,
such as (0,20] followed by (15,50], there is only one copy of the person
at any given time, and no 'correlated data' issue.  (Overlap is weird -- it
corresponds to two copies of me being in the room at the same time.)
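The no-overlap condition can be checked directly. A minimal sketch in base R, assuming columns named id, tstart and tstop; the helper has_overlap and the two toy data frames are hypothetical illustrations, not part of the survival package or of the poster's data:

```r
# Hypothetical helper: TRUE if any subject has two (tstart, tstop] rows
# that cover the same stretch of time.
has_overlap <- function(id, tstart, tstop) {
  ord <- order(id, tstart)            # sort rows within subject by start time
  id <- id[ord]; tstart <- tstart[ord]; tstop <- tstop[ord]
  n <- length(id)
  if (n < 2) return(FALSE)
  same <- id[-1] == id[-n]            # consecutive rows from the same subject
  any(same & tstart[-1] < tstop[-n])  # next interval starts before previous ends
}

# Toy data: subject 1 has (0,20], (20,50] in 'ok' and (0,20], (15,50] in 'bad'
ok  <- data.frame(id = c(1, 1, 2), tstart = c(0, 20, 0), tstop = c(20, 50, 35))
bad <- data.frame(id = c(1, 1, 2), tstart = c(0, 15, 0), tstop = c(20, 50, 35))

has_overlap(ok$id, ok$tstart, ok$tstop)     # FALSE
has_overlap(bad$id, bad$tstart, bad$tstop)  # TRUE
```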
 If there are multiple events for a subject, then there is correlation
(via a different mechanism), and addition of a cluster() term is needed.
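A minimal sketch of the multiple-event case, assuming the bladder2 data set that ships with the survival package as a stand-in (it is not the poster's data, and the choice of covariates is only for illustration):

```r
library(survival)

# bladder2 holds recurrent tumour events in (start, stop] form,
# with several rows and possibly several events per subject (id).
fit_naive  <- coxph(Surv(start, stop, event) ~ rx + number + size,
                    data = bladder2)

# cluster(id) leaves the coefficients unchanged and replaces the
# variance with a robust (sandwich) estimate grouped by subject.
fit_robust <- coxph(Surv(start, stop, event) ~ rx + number + size +
                      cluster(id), data = bladder2)

print(fit_robust)
```

The two fits give the same coefficients; only the standard errors differ.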

2- He later suggests "accommodating non-proportional hazards by building
interactions between covariates and time into the Cox regression model"
as follows:
 
 coxph(Surv(start, stop, arrest.time) ~fin + age + age:stop + prio, ...

This trick ONLY works if 
  a. the data set has been artificially divided (as your example has)
into small uniform time increments, the same for each subject.
  b. the form of the non-ph is actually a linear change in beta over
time.  Use cox.zph on the original model to look at this.  When I see
non-ph (the plot from cox.zph is not horizontal) life is rarely so
simple.
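
Both conditions can be sketched together. This assumes the veteran data set from the survival package and its karno covariate as stand-ins for the poster's data; the 30-day cut points and the column name t_end are arbitrary choices made for illustration:

```r
library(survival)

# Original model on the unsplit data, then the proportional hazards check.
fit0 <- coxph(Surv(time, status) ~ karno, data = veteran)
zp <- cox.zph(fit0)
print(zp)
# plot(zp)  # a curve that is not roughly horizontal suggests non-ph

# (a) Split every subject at the same uniform cut points (every 30 days).
vet2 <- survSplit(Surv(time, status) ~ ., data = veteran,
                  cut = seq(30, 990, by = 30))

# Copy the interval end time into its own column, so the covariate is
# kept distinct from the response variables inside Surv().
vet2$t_end <- vet2$time

# (b) karno:t_end lets the karno coefficient change linearly with time.
fit_tv <- coxph(Surv(tstart, time, status) ~ karno + karno:t_end,
                data = vet2)
print(fit_tv)
```

Whether a linear change is adequate is exactly what the cox.zph plot is meant to show; if the curve bends, this interaction term will not capture it.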

Terry Therneau


