bernhard.reinhardt at dlr.de
Fri Feb 27 12:14:54 CET 2009
It's really great to receive some feedback from a "pro". I'm not sure if
I've got the point right:
Are you suggesting that the Cox model isn't good at forecasting an expected
survival time because of the issues with the prediction of the
survival function at the right tail, and that one should instead use parametric
models such as an exponential model? Or what do you mean by "smooth
parametric estimate"?
Anyway, I have just ordered your book from the library. Hopefully I'll gain
some more insight from reading it.
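To make the quantity under discussion concrete, here is a minimal sketch of a restricted mean survival time, i.e. the area under the predicted survival curve from 0 up to a chosen horizon tau. It is run on synthetic data, not my SST data; the helper `rmst`, the covariate `x` and the choice of `tau` are assumptions made only for illustration. Because the curve is cut off at tau, the noisy right tail beyond that point is ignored rather than estimated.

```r
library(survival)

# Synthetic data for illustration only (not the SST/precipitation data).
set.seed(1)
n <- 55
d <- data.frame(x = rnorm(n))
d$time   <- rexp(n, rate = exp(0.5 * d$x))
d$status <- 1                      # every event is observed, no censoring

fit <- coxph(Surv(time, status) ~ x, data = d)

# Predicted survival curve for one hypothetical new subject.
sf <- survfit(fit, newdata = data.frame(x = 0.3))

# Area under the step function S(t) between 0 and tau.
# 'rmst' is a hypothetical helper, not part of the survival package.
rmst <- function(time, surv, tau) {
  keep <- time <= tau
  t <- c(0, time[keep], tau)       # interval end points
  s <- c(1, surv[keep])            # value of S(t) on each interval
  sum(diff(t) * s)
}

tau <- max(d$time)                 # assumed horizon: largest observed time
mean_restricted <- rmst(sf$time, as.numeric(sf$surv), tau)
print(mean_restricted)
```

The result depends on tau, so it is a restricted mean, not the unrestricted expected survival time.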
Maybe I should point out why I even tried to do such forecasts.
Following the article "Quantifying climate-related risks and
uncertainties using Cox regression models" by Maia and Meinke I try to
deduce winter-precipitation from lagged Sea-Surface-Temperatures (SSTs).
So precipitation is my survival time, and the SST observations at
different lags are my covariates.
The sample size is only 55 and I've got 11 covariates (lag = 0 months to
lag = 10 months) to choose from.
My first goal is to identify the optimal time-lag(s) between
SST-Anomaly-Observation and Precipitation-Observation.
The expectation was that the lag should be a few months.
I thought a Cox model would easily provide such a selection. At first I
used the covariates individually. The coefficients for lags between 0 and 5
months were all quite large, and they then decreased from 6 to 10 months. So I
think 5 months could be the lag of the process, and the high persistence of
the SST accounts for the large coefficients at 0-4 months.
As the next step I used all 11 covariates at once, hoping to get
similar results. Instead the sign of the coefficients jumps "randomly"
from plus to minus, and the magnitudes look random as well.
I also tried using sets of three covariates, e.g. lags 4, 5 and 6. But
even then the sign of the coefficients varies.
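For reference, the two fitting strategies described above can be set up roughly as follows. This is only a sketch on synthetic data: the columns `lag0` to `lag10`, the way persistence is imitated (a shared component plus noise), and the dependence of `precip` on `lag5` are all assumptions for illustration, not my real SST series.

```r
library(survival)

set.seed(42)
n    <- 55
lags <- paste0("lag", 0:10)

# Hypothetical stand-in for persistent SST anomalies:
# 11 columns that share a common component and are therefore correlated.
base <- rnorm(n)
sst  <- sapply(lags, function(v) base + rnorm(n, sd = 0.3))
d    <- data.frame(sst)

# Precipitation plays the role of the survival time; nothing is censored.
d$precip <- rexp(n, rate = exp(0.5 * d$lag5))
d$status <- 1

# Strategy 1: one Cox model per lag.
single <- vapply(lags, function(v) {
  f <- as.formula(paste("Surv(precip, status) ~", v))
  unname(coef(coxph(f, data = d)))
}, numeric(1))

# Strategy 2: all 11 lags in one model.
f_all <- as.formula(paste("Surv(precip, status) ~",
                          paste(lags, collapse = " + ")))
joint <- coef(coxph(f_all, data = d))

# Side-by-side comparison of the coefficients.
print(round(cbind(single = single, joint = joint[lags]), 2))
```

Looking at `cor(d[, lags])` next to this table shows how strongly the lags move together, which may be worth checking before interpreting the joint coefficients.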
So my thought was that maybe I had overfitted the model. But in fact I did
not find any literature on whether that is even possible here. As far as my
limited knowledge goes, overfitted models should reproduce the training period
very well but other periods very poorly. So I first tried to reproduce the
training period, but so far with no success, whether using all 11
covariates or just 1.
Terry Therneau wrote:
> You are mostly correct.
> Because of the censoring issue, there is no good estimate of the mean survival
> time. The survival curve either does not go to zero, or gets very noisy near
> the right hand tail (large standard error); a smooth parametric estimate is what
> is really needed to deal with this.
> For this reason the mean survival, though computed (but see the
> survfit.print.mean option, help(print.survfit)) is not highly regarded. It is
> not an option in predict.coxph.
> Terry T.
> ----begin included message --------------
> if I got it right then the survival-time we expect for a subject is the
> integral over the specific survival-function of the subject from 0 to t_max.
> If I have a trained cox-model and want to make a prediction of the
> survival-time for a new subject I could use
> survfit(coxmodel, newdata=newSubject) to estimate a new
> survival-function which I have to integrate thereafter.
> Actually I thought predict(coxmodel, newSubject) would do this for me,
> but I'm confused which type I have to declare. If I understand the
> little pieces of documentation right then none of the available types is
> exactly the predicted survival-time.
> I think I have to use the mean survival-time of the baseline-function
> times exp(the result of type linear predictor).
> Am I right?