[R] Survival Analysis and Predict time-to-death

David Winsemius dwinsemius at comcast.net
Mon Aug 17 22:51:41 CEST 2015


On Aug 17, 2015, at 12:10 PM, survivalUser wrote:

> Dear All,
> 
> I would like to build a model, based on survival analysis on some data, that
> is able to predict the expected time until death for a new data
> instance.

Are you sure you want to use life expectancy as the outcome? To establish a mathematical expectation you need to know the risk at all times in the future, and, as pointed out in the print.survfit help page, the mean is undefined unless the last observation is a death. Very few datasets support such an estimate. If, on the other hand, you have sufficient events, then you may be able to more readily justify an estimate of median survival.

The print.survfit function does give choices of a "restricted mean survival" or time-to-median-survival as estimate options. See that function's help page.
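
As a minimal sketch of those two options, assuming the `lung` data set that ships with the survival package stands in for your data, and that 365 days is an arbitrary, purely illustrative truncation time for the restricted mean:

```r
library(survival)

# Kaplan-Meier fit for the whole sample; `lung` is only an illustrative data set
km <- survfit(Surv(time, status) ~ 1, data = lung)

# Print the restricted mean (truncated at 365 days, an arbitrary choice)
# alongside the median survival time
print(km, print.rmean = TRUE, rmean = 365)

# The same summary as numbers: for a single curve, $table is a named vector
tab <- summary(km, rmean = 365)$table
med <- unname(tab["median"])
med
```

The restricted mean depends on the truncation time you pick, so it should be reported together with that time.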

> Data
> For each individual in the population I have, for each unit of time, the
> status information and several continuous covariates for that particular
> time. The data is right censored since at the end of the time interval
> analyzed, instances could be still alive and die later.
> 
> Model
> I created the model using R and the survreg function:
> 
> lfit <- survreg(Surv(time, status) ~ X) 
> 
> where:
> - time is the time vector
> - status is the status vector (0 alive, 1 death)
> - X is a bind of multiple vectors of covariates
> 
> Predict time to death
> Given a new individual with some covariates values, I would like to predict
> the estimated time to death. In other words, the number of time units for
> which the individual will be still alive till his death.
> 
> I think I can use this:
> 
> ptime <- predict(lfit, newdata=data.frame(X=NEWDATA), type='response')

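type="response" is documented in the `?predict.survreg` help page, but it gives a prediction on the original time scale, which is not in general the same thing as a mathematical expectation of the time to death. Were you suggesting that code on the basis of some tutorial?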
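
If a median is an acceptable target, a minimal sketch of asking `predict.survreg` for quantiles instead might look like this. The `lung` data, the covariates `age` and `sex`, and the new individual are all illustrative assumptions, not your data:

```r
library(survival)

# Weibull accelerated failure time model; `lung`, `age` and `sex` are
# illustrative stand-ins for the poster's data and covariates
wfit <- survreg(Surv(time, status) ~ age + sex, data = lung, dist = "weibull")

# Hypothetical new individual
new_case <- data.frame(age = 60, sex = 1)

# Predicted median survival time (the 0.5 quantile) with its standard error
med <- predict(wfit, newdata = new_case, type = "quantile", p = 0.5,
               se.fit = TRUE)
med$fit

# Several quantiles at once give a rough picture of the predicted distribution
qs <- as.numeric(predict(wfit, newdata = new_case, type = "quantile",
                         p = c(0.25, 0.5, 0.75)))
qs
```

Note that the column names in `newdata` have to match the variable names used in the model formula.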

> Is that correct? Am I going to get the expected-time-to-death that I would
> like to have?

Most people would be using `survfit` to construct survival estimates.
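
A minimal sketch of that approach, assuming a Cox model on the `lung` data set with illustrative covariates `age` and `sex`, and a hypothetical new individual:

```r
library(survival)

# Cox model; the data set and covariates are illustrative assumptions
cfit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Hypothetical new individual
new_case <- data.frame(age = 60, sex = 1)

# Predicted survival curve for that individual
sf <- survfit(cfit, newdata = new_case)

# Printing the curve reports the median survival time with confidence limits
print(sf)

# Median extracted as a number, plus survival probabilities at chosen times
med <- as.numeric(quantile(sf, probs = 0.5)$quantile)
s <- summary(sf, times = c(180, 365))
s$surv
```

The median is only estimable if the predicted curve actually drops below 0.5 within the observed follow-up.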

> 
> In theory, I could provide also the time information (the time when the
> individual has those covariates values), should I simply add that in the
> newdata:
> 
> ptime <- predict(lfit, newdata=data.frame(time=TIME, X=NEWDATA),
> type='response')
> 
> Is that correct?

This sounds like you are considering time-varying predictors. Adding them via the 'newdata' argument is most definitely not the correct method. I would therefore ask whether you really want to use a parametric survival model in the first place. The coxph function has facilities for time-varying covariates.
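
For illustration only, a minimal sketch of the counting-process layout that coxph expects for time-varying covariates, using the `heart` data set that ships with the survival package (an assumption standing in for your data). Each row covers one interval (start, stop] during which the covariates are constant:

```r
library(survival)

# One row per patient per interval; `transplant` changes value over time
head(heart[, c("id", "start", "stop", "event", "transplant")])

# Cox model with a time-varying covariate, in (start, stop] form
tvfit <- coxph(Surv(start, stop, event) ~ age + surgery + transplant,
               data = heart)

# Estimated log hazard ratios and their standard errors
coefs <- summary(tvfit)$coefficients
coefs
```

Your per-time-unit records would need to be reshaped into this interval form before fitting.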


> Is this going to improve the prediction?

It would most likely severely complicate prediction. Survival estimates may be more problematic in that case on theoretical grounds.

> (for my data, the
> time already passed should be an important variable).
> 
> Any other suggestions or comments?
> 
> Thank you!
> 

R-help at r-project.org

The real Rhelp mailing list  ....   not the impostor Rhelp at Nabble


-- 

David Winsemius
Alameda, CA, USA


