[R] Is there an equivalent to predict(..., type="linear") of a Proportional hazard model for a Cox model instead?

Thu Nov 25 19:23:25 CET 2010

I hit the send button on my second reply before I intended to. Since  
then I have noticed that the question I thought you were asking is not  
at all a good match to the Subject line of your message. There is a  
type ="lp" in predict.coxph and that is the linear predictor, although  
it is not a value that is on any transformation of time as would be  
the linear predictor of an accelerated failure time estimate. If you  
are looking for another method (in addition to predict.coxph) to  
produce expected survival from a coxph fit, you could also look at  
survexp. The ratetable argument accepts a fitted Cox model.

I have still not found a good answer to the question that I thought  
the body of your second posting was posing: namely why days and years  
are not handled the same in predict.survreg and predict.coxph.

-- 
David.
On Nov 25, 2010, at 12:35 PM, David Winsemius wrote:

>
> On Nov 25, 2010, at 10:08 AM, Ben Rhelp wrote:
>
>> Hi David,
>>
>> Thank you for your reply. See below for more information.
>>
>>
>>> From: David Winsemius
>>
>>>
>>> On Nov 25, 2010, at 7:27 AM, Ben Rhelp wrote:
>>>
>>>> I manage to  achieve similar results with a Cox model as follows  
>>>> but I don't
>>>> really  understand why we have to take the inverse of the linear  
>>>> prediction
>>> with
>>>> the Cox model
>>>
>>> Different parameterization. You can find expanded answer(s)  in  
>>> the archives
>>> and in the documentation of  survreg.distributions.
>>>
>>
>> I understand (i think) the difference in model structures between a  
>> Cox (coxph)
>> and Proportional hazard model (survreg).
>
> I couldn't tell whether this means you decided that those citations  
> answered your question. If not, then refer to Therneau's or Lumley's  
> replies in rhelp to a similar question earlier this month.:
>
> https://stat.ethz.ch/pipermail/r-help/2010-November/259796.html
> https://stat.ethz.ch/pipermail/r-help/2010-November/259747.html
>
>>
>>
>>>
>>>> and why we do not need to divide by the  number of days in the year
>>>> anymore?
>>>
>>> Here I'm guessing (since you  don't offer enough evidence to  
>>> confirm) that the
>>> difference is in the time  scales used in your Aidsp$survtime  
>>> versus some other
>>> example to which you are  comparing .
>>
>> Both models are run from the same data, so I am not expecting any  
>> differences in
>> time scales.
>>
>> To get similar results, I need to actually run the following  
>> equations:
>> expected_lifetime_in_years = exp(fit)/365.25                    --- 
>> > Linear
>> prediction of the Proportional hazard model
>> expected_lifetime_in_years = 1/exp(fit)                             
>> ---> Linear
>> prediction of the Cox model
>> where fit come from the linear prediction of each models,  
>> respectively.
>>
>> Actually, in the code below, I re-run the models and predictions  
>> based on a
>> yearly sampling time (instead of daily).
>> Again, to get similar results, I now need to actually run the  
>> following
>> equations:
>> expected_lifetime_in_years = exp(fit)                                
>> ---> Linear
>> prediction of the Proportional hazard model
>> expected_lifetime_in_years = 1/exp(fit)                             
>> ---> Linear
>> prediction of the Cox model
>>
>> I think I understand the logic behind the results of the  
>> proportional hazard
>> model, but not for the prediction of the Cox model.
>
> Cox models are PH models.
>
>>
>> Thank you for your help. I hope this is not a too stupid hole in my  
>> logic.
>>
>> Here is the self contained R code to produce the charts:
>>
>> library(MASS);
>> library(survival);
>>
>> #Same data but parametric fit
>> make.aidsp <- function(){
>>  cutoff <- 10043 # July 1987 in julian days
>>  btime <- pmin(cutoff, Aids2$death) - pmin(cutoff, Aids2$diag)
>>  atime <- pmax(cutoff, Aids2$death) - pmax(cutoff, Aids2$diag)
>>  survtime <- btime + 0.5*atime
>>  status <- as.numeric(Aids2$status)
>>  data.frame(survtime, status = status - 1, state = Aids2$state,
>>  T.categ = Aids2$T.categ, age = Aids2$age, sex = Aids2$sex)
>> }
>> Aidsp <- make.aidsp()
>>
>> # MASS example with the proportional hazard model
>> par(mfrow = c(1, 2));
>> (aids.ps <- survreg(Surv(survtime + 0.9, status) ~ state + T.categ +
>> pspline(age, df=6), data = Aidsp))
>> zz <- predict(aids.ps, data.frame(state = factor(rep("NSW", 83),  
>> levels =
>> levels(Aidsp$state)),
>>   T.categ = factor(rep("hs", 83), levels = levels(Aidsp$T.categ)),  
>> age =
>> 0:82), se = T, type = "linear")
>> plot(0:82, exp(zz$fit)/365.25, type = "l", ylim = c(0, 2), xlab =  
>> "age", ylab =
>> "expected lifetime (years)")
>> lines(0:82, exp(zz$fit+1.96*zz$se.fit)/365.25, lty = 3, col = 2)
>> lines(0:82, exp(zz$fit-1.96*zz$se.fit)/365.25, lty = 3, col = 2)
>> rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize =  
>> 0.015)
>>
>>
>> # Same example but with a Cox model instead
>> (aids.pscp <- coxph(Surv(survtime + 0.9, status) ~ state + T.categ +
>> pspline(age, df=6), data = Aidsp))
>> zzcp <- predict(aids.pscp, data.frame(state = factor(rep("NSW",  
>> 83), levels =
>> levels(Aidsp$state)),
>>   T.categ = factor(rep("hs", 83), levels = levels(Aidsp$T.categ)),  
>> age =
>> 0:82), se = T, type = "lp")
>> plot(0:82, 1/exp(zzcp$fit), type = "l", ylim = c(0, 2), xlab =  
>> "age", ylab =
>> "expected lifetime (years)")
>> lines(0:82, 1/exp(zzcp$fit+1.96*zzcp$se.fit), lty = 3, col = 2)
>> lines(0:82, 1/exp(zzcp$fit-1.96*zzcp$se.fit), lty = 3, col = 2)
>> rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize =  
>> 0.015)
>>
>>
>> # Change the sampling time from daily to yearly
>> par(mfrow = c(1, 1));
>> (aids.ps <- survreg(Surv((survtime + 0.9)/365.25, status) ~ state +  
>> T.categ +
>> pspline(age, df=6), data = Aidsp))
>> zz <- predict(aids.ps, data.frame(state = factor(rep("NSW", 83),  
>> levels =
>> levels(Aidsp$state)),
>>   T.categ = factor(rep("hs", 83), levels = levels(Aidsp$T.categ)),  
>> age =
>> 0:82), se = T, type = "linear")
>> plot(0:82, exp(zz$fit), type = "l", ylim = c(0, 2), xlab = "age",  
>> ylab =
>> "expected lifetime (years)")
>>
>> (aids.pscp <- coxph(Surv((survtime + 0.9)/365.25, status) ~ state +  
>> T.categ +
>> pspline(age, df=6), data = Aidsp))
>> zzcp <- predict(aids.pscp, data.frame(state = factor(rep("NSW",  
>> 83), levels =
>> levels(Aidsp$state)),
>>   T.categ = factor(rep("hs", 83), levels = levels(Aidsp$T.categ)),  
>> age =
>> 0:82), se = T, type = "lp")
>> lines(0:82, 1/exp(zzcp$fit),col=4)
>>
>>
>>
>>
>
> David Winsemius, MD
> West Hartford, CT
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT