[R] Survfit with a coxph object

Terry Therneau therneau at mayo.edu
Tue Jan 2 17:29:32 CET 2007


> When I run coxph I get the coxph object back fairly quickly,
> however when I try to run survfit  it does not come back.

 If you are very, very patient the routine will come back eventually.  
Unfortunately, for some very large data sets this could be months...

   The reason is that the algorithms for coxph have been carefully optimized
over the years, but survfit is used so much less frequently that I have not
propagated many of these improvements forward to that routine.  In particular,
there is a computation which is O(d*n) if done in the obvious way, but O(2n)
when approached more cleverly, where d = number of events and n = number of
observations in the data set.  Your example has d ~ 50,000 and n ~ 100,000, so
I would expect survfit.coxph to be roughly d/2 = 25,000 times slower than coxph.
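
   To make the difference concrete, here is a minimal sketch in Python.  It is
not the survival package's code: the function names and toy data are invented
for illustration, and the quantity computed (the sum of risk scores over each
event's risk set) is only assumed to be representative of the computation in
question.  The obvious version rescans all n observations for each of the d
events; the other sorts once and carries a running total.

```python
def risk_sums_naive(times, status, risk):
    """O(d*n): for every event, rescan all observations still at risk."""
    out = []
    for t_i, s_i in zip(times, status):
        if s_i == 1:
            total = sum(r for t, r in zip(times, risk) if t >= t_i)
            out.append((t_i, total))
    return sorted(out)


def risk_sums_fast(times, status, risk):
    """One sort, then a single pass from the latest time to the earliest."""
    order = sorted(range(len(times)), key=lambda i: times[i], reverse=True)
    out = []
    running = 0.0
    i, n = 0, len(order)
    while i < n:
        t = times[order[i]]
        j = i
        # Add everyone tied at time t before the running total is used.
        while j < n and times[order[j]] == t:
            running += risk[order[j]]
            j += 1
        # Each event at time t shares the same risk-set sum.
        for k in range(i, j):
            if status[order[k]] == 1:
                out.append((t, running))
        i = j
    return sorted(out)


# Hypothetical toy data: status 1 = event, 0 = censored.
times = [5, 3, 8, 3, 10, 7]
status = [1, 0, 1, 1, 0, 1]
risk = [1.0, 2.0, 0.5, 1.5, 1.0, 2.0]

print(risk_sums_naive(times, status, risk))
print(risk_sums_fast(times, status, risk))
```

Both functions return the same list of (event time, risk-set sum) pairs; only
the amount of work differs.  The sort itself costs O(n log n), but it is done
once rather than once per event.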

  The long-term solution is for me to fix this.  It's a couple of weeks' work,
if I can only find the weeks to do it.  The mid-term one is to take Frank
Harrell's suggestion.  If standard errors are not needed, there is an O(n)
algorithm, which he has implemented as part of his additions to the coxph
suite.
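
   As an illustration of why dropping the standard errors makes a single pass
possible, here is a minimal Python sketch of a Breslow-type cumulative baseline
hazard.  This is not Harrell's implementation, whose details are not given
here; the function name `breslow_cumhaz` and the choice of the Breslow estimate
are assumptions made for the example.  `lp` stands for the fitted linear
predictor of each observation.

```python
import math


def breslow_cumhaz(times, status, lp):
    """Return [(event_time, cumulative baseline hazard)], no variance terms."""
    order = sorted(range(len(times)), key=lambda i: times[i], reverse=True)
    at_risk = 0.0      # running sum of exp(lp) over the current risk set
    increments = []    # (time, hazard jump), collected latest time first
    i, n = 0, len(order)
    while i < n:
        t = times[order[i]]
        deaths = 0
        # Absorb all observations tied at time t into the risk set.
        while i < n and times[order[i]] == t:
            at_risk += math.exp(lp[order[i]])
            deaths += status[order[i]]
            i += 1
        if deaths:
            increments.append((t, deaths / at_risk))
    # Accumulate the jumps in increasing time order.
    total = 0.0
    out = []
    for t, h in reversed(increments):
        total += h
        out.append((t, total))
    return out


# With all linear predictors equal to zero this reduces to Nelson-Aalen.
print(breslow_cumhaz([1, 2, 3, 4], [1, 0, 1, 1], [0.0, 0.0, 0.0, 0.0]))
```

After the initial sort, each observation is visited once, so the work grows
linearly in n.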

	Terry Therneau


