[R] Obtaining value of median survival for survfit function to use in calculation

Marc Schwartz marc_schwartz at me.com
Wed Sep 9 04:27:45 CEST 2009


Hi,

I have a strong recollection of a relatively recent communication on  
this issue with Terry, but cannot seem to locate it at the moment and  
I don't recall if it was on-list or off-list. I am either on drugs or  
Terry indicated that he was going to modify either the returned  
survfit() or summary.survfit() object to include the medians and CI's  
as distinct elements, rather than or in addition to this being a part  
of print.survfit().

In either case, the evolving approach that I have taken over the years  
has been to use capture.output() to get the results of using  
print.survfit() on the survfit() object and then parsing the results  
to get the medians and CIs:


# Get the last 5 rows of the output (number of groups + 1)
Tmp <- tail(capture.output(print(lung.byPS)), 5)

 > Tmp
[1] "          records n.max n.start events median 0.95LCL 0.95UCL"
[2] "ph.ecog=0      63    63      63     37    394     348     574"
[3] "ph.ecog=1     113   113     113     82    306     268     429"
[4] "ph.ecog=2      50    50      50     44    199     156     288"
[5] "ph.ecog=3       1     1       1      1    118      NA      NA"


# Get the 4 data rows and header row into a data frame
Res <- read.table(z <- textConnection(Tmp), header = TRUE)


# Close z
close(z)


# Voilà
 > Res
           records n.max n.start events median X0.95LCL X0.95UCL
ph.ecog=0      63    63      63     37    394      348      574
ph.ecog=1     113   113     113     82    306      268      429
ph.ecog=2      50    50      50     44    199      156      288
ph.ecog=3       1     1       1      1    118       NA       NA


 > str(Res)
'data.frame':	4 obs. of  7 variables:
  $ records : int  63 113 50 1
  $ n.max   : int  63 113 50 1
  $ n.start : int  63 113 50 1
  $ events  : int  37 82 44 1
  $ median  : int  394 306 199 118
  $ X0.95LCL: int  348 268 156 NA
  $ X0.95UCL: int  574 429 288 NA


Now you can get the medians and CI's using:

 > Res$median
[1] 394 306 199 118

 > Res$X0.95LCL
[1] 348 268 156  NA

 > Res$X0.95UCL
[1] 574 429 288  NA



Aw shoot.....I am not on drugs....I just found it. It was  
summary.survfit() and it was May of this year:

  http://tolstoy.newcastle.edu.au/R/e6/help/09/05/15413.html

Arguably, it is a bit buried in the details of the Value section of ? 
summary.survfit and since it is not print()ed, it is not immediately  
evident.

Thus, all you need to do is:

 > summary(lung.byPS)$table
           records n.max n.start events median 0.95LCL 0.95UCL
ph.ecog=0      63    63      63     37    394     348     574
ph.ecog=1     113   113     113     82    306     268     429
ph.ecog=2      50    50      50     44    199     156     288
ph.ecog=3       1     1       1      1    118      NA      NA


 > summary(lung.byPS)$table[, "median"]
ph.ecog=0 ph.ecog=1 ph.ecog=2 ph.ecog=3
       394       306       199       118

 > summary(lung.byPS)$table[, "0.95LCL"]
ph.ecog=0 ph.ecog=1 ph.ecog=2 ph.ecog=3
       348       268       156        NA

 > summary(lung.byPS)$table[, "0.95UCL"]
ph.ecog=0 ph.ecog=1 ph.ecog=2 ph.ecog=3
       574       429       288        NA


Now I can sleep better...and hopefully remember this for the next  
time...  :-)

HTH,

Marc Schwartz



On Sep 8, 2009, at 6:51 PM, David Winsemius wrote:

> The survfit.object help page says:
>
> "The print.survfit method does more computation than is typical for  
> a print method and is documented on a separate page."
>
> It takes a bit of digging, but after first trying:
>
> getAnywhere(print.survfit)   # and then following code and trying
> getAnywhere(survmean)  # survmean is the function which does the  
> work using pfun
>
> ... I think I finally understand exactly what the print.survfit help  
> page means when it refers to a "side-effect".
>
> So there is no durable element in the survfit object that is the  
> median. The automation you request would involve duplicating the  
> code of survmean but with assignment of the "out" matrix to  
> something that does not get discarded.
>
> -- 
> David Winsemius
>
> On Sep 8, 2009, at 6:42 PM, Polwart Calum (County Durham and  
> Darlington NHS Foundation Trust) wrote:
>
>> Hi,
>>
>> I'm sure this should be simple but I can't figure it out!  I want  
>> to get the median survival calculated by the survfit function and  
>> use the value rather than just be able to print it.  Something like  
>> this:
>>
>> library(survival)
>> data(lung)
>> lung.byPS = survfit(Surv (time, status) ~ ph.ecog, data=lung)
>> # lung.byPS
>> Call: survfit(formula = Surv(time, status) ~ ph.ecog, data = lung)
>> 1 observation deleted due to missingness
>>             n events median 0.95LCL 0.95UCL
>> ph.ecog=0  63     37    394     348     574
>> ph.ecog=1 113     82    306     268     429
>> ph.ecog=2  50     44    199     156     288
>> ph.ecog=3   1      1    118     Inf     Inf
>>
>> What I want is to be able to call the median using something like:
>>
>> lung.byPS$median[ph.ecog=0]
>>
>> so that I can add it to a plot like this:
>>               plot (lung.byPS,
>>                               conf.int=F,
>>                               lty=1:4,
>>               )
>>               abline(h=0.5, lty=5)
>>               abline(v=lung.byPS$median[ph.ecog=1])
>>               abline(v=lung.byPS$median[ph.ecog=2])
>>
>> Anyone got any easy solutions?  Its fairly normal to plot across  
>> and down to show medians on survival curves... so I'm sure it must  
>> be possible to automate.
>>




More information about the R-help mailing list