[R] Survival analysis extrapolation

Terry Therneau therneau at mayo.edu
Thu May 27 14:41:31 CEST 2010

 A few suggestions

 1. Use the data argument
  wfit <- survreg(Surv(Time, Status) ~ Treatment, data=survdata)

 2. Use the predict function to look at your survival curve.  This
avoids any parameter change mistakes.  It also works for any

  pp <- 0:60/100
  wsurv <- predict(wfit, type='quantile', p=pp, 
  matplot(t(wsurv), 1-pp, xlab="time", ylab="survival", type='l')

Don't bother to ask for the 100th percentile -- it is infinity.  I chose
0-60 based on your questions.

(Why newdata=survdata[1:2,]?  The default for predict is to give a curve
for every observation in the original data.  You only need 2, one for a
subject on A and one for a subject on B, and your test data showed lines
1&2 of the data had one of each).

3. Overlay a Kaplan-Meier, to partially answer your third question.
   kmfit <- survfit(Surv(Time, Status) ~ Treatment, data=survdata)
   lines(kmfit, mark.time=F, col=1:2)

Terry Therneau

--- begin included message  --

I'm trying to fit a curve to some 1 year failure-time data, so that I
extrapolate and predict failure rates up to 3 years. The data is in the
general form:

Treatment	       Time	                      Status
Treatment A	       28	                          0
Treatment B	       28	                          0
Treatment B	       28	                          0
Treatment A	       28	                          1
Treatment B	       56	                          0
Treatment A	       56	                          0
Treatment B	       84	                          0
Treatment A	       84	                          0
Treatment A	       84	                          0

etc etc

Where time is in days and goes up to 360, 0 represents a withdrawal and
represents a failure. The code I've tried is as follows:




n<-seq(0, 1080, 30 )

##Compute Weibull CDF

A<-pweibull(n, scale=exp(WeiModel$coef[1]), shape=1/WeiModel$scale)


##Transform into survivor function



The problem is, that the Weibull curve is predicting a survival
for treatment B of around 11% at the end of year 3 (day 1080), and past
analyses have shown that this value should be somewhere in the region of
~40%. I'm finding a similar problem for treatment A also. I've also
fitting an exponential distribution but encountered the same problems.
questions are:

1)      Have I computed the survivor function correctly, in terms of
from the survreg output to the    pweibull parameters?

2)      If the above seems correct, then is there an obvious
reason why I'm getting survival probabilities far less than expected?
example, is the general method I'm using not the best approach? If so
you suggest a better one?)

3)      Is it worth trying to do the same thing with the other
that R uses for survival analysis (lognormal, gaussian etc). If so, how
you extrapolate the model out to 3 years? Are there functions that are
equivalent to the 'pweibull' function for all these distributions?

Any help you could give on this would be most appreciated.

Kind regards,


More information about the R-help mailing list