[R] Survival Rate Estimates

Terry Therneau therneau at mayo.edu
Mon May 16 16:15:56 CEST 2011


>>  pfit2 <- pyears(time, status) ~ acut + sex, data=mydata)

>I think there was an omitted Surv function call above.
 Correct; the call should have read
   pfit2 <- pyears(Surv(time, status) ~ acut + sex, data=mydata)

>And a follow on question and a tentative answer: Can one generate a  
>'pyears' table that incorporates categories of time observed by
>tcut()- ting the time variable and having it in both the Surv argument
>and on the RHS of the formula? Experimentation makes me think this
>works very  well.

The tcut function is simply "cut" with a different class added.  In
fact, I just now reviewed the code and need to update it: it duplicates an
old version of cut, written to work around a flaw in cut at the time, but
that issue is long since resolved.  I'll update it to call cut and add to
the returned class.  The pyears function looks for objects of class "tcut"
and treats them specially: the value contained therein is the category one
starts in, but the category can change over time as you age.  The key
"trick" with tcut objects is that the follow-up time (left hand side of the
formula) and the tcut result must be in the same units for the program to
do this properly.
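
To make the units point concrete, here is a minimal sketch using the lung
data from the survival package.  It assumes (my reading of the data set, not
something stated above) that lung$age is in years and lung$time is in days,
so both the starting ages and the cutpoints are converted to days before
calling tcut.  The age bands and the names agegrp and fit_age are only for
illustration.

```r
library(survival)

# Assumption: lung$age is age in years at enrollment and lung$time is
# follow-up in days, so ages and cutpoints are both expressed in days.
agegrp <- tcut(lung$age * 365.25, c(0, 50, 60, 70, 120) * 365.25,
               labels = c("<50", "50-59", "60-69", "70+"))

# Follow-up (days) and the tcut variable (days) are now on the same scale,
# so subjects can move to an older age band during follow-up.
fit_age <- pyears(Surv(time, status) ~ agegrp, data = lung)

fit_age$event    # deaths, by age band at the time of death
fit_age$pyears   # person-years per band (default scale = 365.25 days)
```

Had the cutpoints been left in years while time is in days, a subject would
appear to age one "year" per day of follow-up.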

 As to the direct question: yes, we often break up follow-up time.  A
simple example:
   > library(survival)
   > fugroup <- tcut(rep(0, nrow(lung)), c(0, 182, 365, 730, 100*365),
   +                 labels=c("0-6m", "6m-1yr", "1-2yr", "2+ yr"))
   > fit <- pyears(Surv(time, status) ~ sex + fugroup, data=lung)
   > fit$event
  0-6m 6m-1yr 1-2yr 2+ yr
1   51     34    24     3
2   15     21    14     3

 In this advanced lung cancer data set, the majority of the deaths
happen within 1 year of study enrollment.
  The first argument to tcut is each subject's starting point on the time
scale, which is 0 here.  Don't make the common mistake of
"fugroup <- tcut(lung$time, ...", which would start each subject's time
scale at the day of their last follow-up.
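
A quick way to see the effect of that mistake is to run both versions side
by side.  This is only a sketch; the names cuts, labs, fu_ok, fu_bad, ok and
bad are made up for the comparison.

```r
library(survival)

cuts <- c(0, 182, 365, 730, 100*365)
labs <- c("0-6m", "6m-1yr", "1-2yr", "2+ yr")

# Intended: every subject enters the follow-up time scale at day 0
fu_ok  <- tcut(rep(0, nrow(lung)), cuts, labels = labs)

# Mistake: every subject enters at their own last follow-up day
fu_bad <- tcut(lung$time, cuts, labels = labs)

ok  <- pyears(Surv(time, status) ~ sex + fu_ok,  data = lung)
bad <- pyears(Surv(time, status) ~ sex + fu_bad, data = lung)

ok$event    # deaths by sex and interval since enrollment
bad$event   # same deaths, but shifted toward the later intervals
```

The total number of deaths is the same in both tables; only their
allocation to the follow-up intervals changes.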

Terry T


