[R] Predictions from "coxph" or "cph" objects
Göran Broström
goran.brostrom at umu.se
Sun Jul 6 11:17:10 CEST 2014
On 2014-07-06 10:48, Göran Broström wrote:
> David and Axel,
>
> I have two comments to your discussion:
>
> (i) The area under the survival curve is equal to the mean of the
> distribution, so the estimate of the mean should be the sum of the areas
> of the rectangles defined by the estimated survival curve and the
> successive distances between observed event times.
>
> Thus,
>
> > surv <- pred$surv
> > time <- pred$time
> > sum(surv * diff(time))
>
> should give you the (estimated) mean). (Note that time[1] == 0, and
> length(time) == length(surv) + 1)
Well, this is not quite true; on the first interval the survival curve
is one, so you need to
> surv <- c(1, surv)
first. But then the lengths of the surv and time vectors do not match so
you need to add a (large) time at the end of time. If the largest
observation is an event, 'no problem' (surv is zero), but otherwise ...
Btw, I tried
> exit <- rexp(10)
> event <- rep(1, 10)
> fit <- coxph(Surv(exit, event) ~ 1)
> survfit(fit)$surv
[1] 0.90483742 0.80968410 0.71454371 0.61942215 0.52432953 0.42928471
[7] 0.33432727 0.23955596 0.14529803 0.05345216
> survfit(Surv(exit, event) ~ 1)$surv
[1] 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0
so be careful ...
Göran
>
> (I do not think that David's suggestion gives the same answer, but I may
> be wrong.)
>
> (ii) With censored data, this may be a bad idea. For instance, when the
> largest observation is a censoring time, you may badly underestimate the
> mean. Your best hope is to be able to estimate a conditional mean of the
> type E(T | T < x).
>
> This is essentially a non-parametric situation, and therefore it is
> better to stick to medians and quantiles.
>
> Göran Broström
>
> On 2014-07-06 06:17, David Winsemius wrote:
>>
>> On Jul 5, 2014, at 9:12 PM, David Winsemius wrote:
>>
>>>
>>> On Jul 5, 2014, at 12:43 PM, Axel Urbiz wrote:
>>>
>>>> Thank you David. It is my understanding that using survfirsurvit
>>>> below I get the median predicted survival. I actually was looking
>>>> for the mean. I can't seem to find in the documentation how to get
>>>> that.
>>>>
>>>> options(na.action=na.exclude) # retain NA in predictions
>>>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>>>> pred <- survfit(fit, newdata=lung)
>>>> head(pred)
>>>>
>>> There might be a way. I don't know it if so, so I would probably
>>> just use the definition of the mean:
>>>
>>> sum(summary(pred)$surv* summary(pred)$time)/sum( summary(pred)$time)
>>>
>>
>> Er, I think I meant to type:
>>
>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>> pred <- survfit(fit)
>>
>> sum(summary(pred)$surv* summary(pred)$time)/sum( summary(pred)$surv)
>> [1] 211.0943
>>
>>
>>> (I continue to take effort to keep my postings in plain text despite
>>> my mail-clients's efforts to match your formatted postings. It adds
>>> to the work of responders when you post formatted questions and
>>> responses.)
>>>
>>>
>>>> Thanks again,
>>>> Axel.
>>>>
>>>>
>>>>
>>>> On Sat, Jul 5, 2014 at 1:54 PM, David Winsemius <dwinsemius at comcast.net
>>>>> wrote:
>>>>
>>>> On Jul 5, 2014, at 5:28 AM, Axel Urbiz wrote:
>>>>
>>>> Dear R users,
>>>>
>>>> My apologies for the simple question, as I'm starting to learn the
>>>> concepts
>>>> behind the Cox PH model. I was just experimenting with the survival
>>>> and rms
>>>> packages for this.
>>>>
>>>> I'm simply trying to obtain the expected survival time (as opposed
>>>> to the
>>>> probability of survival at a given time t).
>>>>
>>>> What does "expected survival time" actually mean? Do you want the
>>>> median survival time?
>>>>
>>>>
>>>> I can't seem to find an option
>>>> from the "type" argument in the predict methods from
>>>> coxph{survival} or
>>>> cph{rms} that will give me expected survival times.
>>>>
>>>> library(rms)
>>>> options(na.action=na.exclude) # retain NA in predictions
>>>> fit <- coxph(Surv(time, status) ~ age + ph.ecog, lung)
>>>> fit2 <- cph(Surv(time, status) ~ age + ph.ecog, lung)
>>>> head(predict(fit,type="lp"))
>>>> head(predict(fit2,type="lp"))
>>>>
>>>> `predict` will return the results of the regression, i.e. the log-
>>>> hazard ratios for each term in the RHS of the formula. What you
>>>> want (as described in the Index for the survival package) is either
>>>> `survfit` or `survexp`.
>>>>
>>>> require(survival)
>>>> help(pack=survival)
>>>> ?survfit
>>>> ?survexp
>>>> ?summary.survfit
>>>> ?quantile.survfit # to get the median
>>>> ?print.summary.survfit
>>>>
>>>> require(rms)
>>>> help(pack=rms)
>>>>
>>>> The rms-package also adds a `survfit.cph` function but I have found
>>>> the `survest` function also provides useful added features, beyond
>>>> those offered by survfit
>>>>
>>>>
>>>>
>>>> Thank you.
>>>>
>>>> Regards,
>>>> Axel.
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> This is a plain text mailing list.
>>>>
>>>> --
>>>>
>>>> David Winsemius, MD
>>>> Alameda, CA, USA
>>>>
>>>>
>>>
>>> David Winsemius, MD
>>> Alameda, CA, USA
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius, MD
>> Alameda, CA, USA
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
More information about the R-help
mailing list