[R] Is there an equivalent to predict(..., type="linear") of a Proportional hazard model for a Cox model instead?

Ben Rhelp benrhelp at yahoo.co.uk
Thu Nov 25 16:08:20 CET 2010


Hi David,

Thank you for your reply. See below for more information.


> From: David Winsemius 
 
> 
> On Nov 25, 2010, at 7:27 AM, Ben Rhelp wrote:
> 
> > I manage to  achieve similar results with a Cox model as follows but I don't
> > really  understand why we have to take the inverse of the linear prediction 
>with
> >  the Cox model
> 
> Different parameterization. You can find expanded answer(s)  in the archives 
>and in the documentation of  survreg.distributions.
> 

I understand (i think) the difference in model structures between a Cox (coxph) 
and Proportional hazard model (survreg). 


> 
> > and why we do not need to divide by the  number of days in the year
> > anymore?
> 
> Here I'm guessing (since you  don't offer enough evidence to confirm) that the 
>difference is in the time  scales used in your Aidsp$survtime versus some other 
>example to which you are  comparing .

Both models are run from the same data, so I am not expecting any differences in 
time scales. 

To get similar results, I need to actually run the following equations:
expected_lifetime_in_years = exp(fit)/365.25                    ---> Linear 
prediction of the Proportional hazard model
expected_lifetime_in_years = 1/exp(fit)                            ---> Linear 
prediction of the Cox model
where fit come from the linear prediction of each models, respectively.

Actually, in the code below, I re-run the models and predictions based on a 
yearly sampling time (instead of daily).
Again, to get similar results, I now need to actually run the following 
equations:
expected_lifetime_in_years = exp(fit)                               ---> Linear 
prediction of the Proportional hazard model
expected_lifetime_in_years = 1/exp(fit)                            ---> Linear 
prediction of the Cox model

I think I understand the logic behind the results of the proportional hazard 
model, but not for the prediction of the Cox model.

Thank you for your help. I hope this is not a too stupid hole in my logic.

Here is the self contained R code to produce the charts:

library(MASS);
library(survival);

#Same data but parametric fit
make.aidsp <- function(){
   cutoff <- 10043 # July 1987 in julian days
   btime <- pmin(cutoff, Aids2$death) - pmin(cutoff, Aids2$diag)
   atime <- pmax(cutoff, Aids2$death) - pmax(cutoff, Aids2$diag)
   survtime <- btime + 0.5*atime
   status <- as.numeric(Aids2$status)
   data.frame(survtime, status = status - 1, state = Aids2$state,
   T.categ = Aids2$T.categ, age = Aids2$age, sex = Aids2$sex)
}
Aidsp <- make.aidsp()

# MASS example with the proportional hazard model
par(mfrow = c(1, 2));
(aids.ps <- survreg(Surv(survtime + 0.9, status) ~ state + T.categ + 
pspline(age, df=6), data = Aidsp))
zz <- predict(aids.ps, data.frame(state = factor(rep("NSW", 83), levels = 
levels(Aidsp$state)),
    T.categ = factor(rep("hs", 83), levels = levels(Aidsp$T.categ)), age = 
0:82), se = T, type = "linear")
plot(0:82, exp(zz$fit)/365.25, type = "l", ylim = c(0, 2), xlab = "age", ylab = 
"expected lifetime (years)")
lines(0:82, exp(zz$fit+1.96*zz$se.fit)/365.25, lty = 3, col = 2)
lines(0:82, exp(zz$fit-1.96*zz$se.fit)/365.25, lty = 3, col = 2)
rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize = 0.015)


# Same example but with a Cox model instead
(aids.pscp <- coxph(Surv(survtime + 0.9, status) ~ state + T.categ + 
pspline(age, df=6), data = Aidsp))
zzcp <- predict(aids.pscp, data.frame(state = factor(rep("NSW", 83), levels = 
levels(Aidsp$state)),
    T.categ = factor(rep("hs", 83), levels = levels(Aidsp$T.categ)), age = 
0:82), se = T, type = "lp")
plot(0:82, 1/exp(zzcp$fit), type = "l", ylim = c(0, 2), xlab = "age", ylab = 
"expected lifetime (years)")
lines(0:82, 1/exp(zzcp$fit+1.96*zzcp$se.fit), lty = 3, col = 2)
lines(0:82, 1/exp(zzcp$fit-1.96*zzcp$se.fit), lty = 3, col = 2)
rug(Aidsp$age + runif(length(Aidsp$age), -0.5, 0.5), ticksize = 0.015)


# Change the sampling time from daily to yearly
par(mfrow = c(1, 1));
(aids.ps <- survreg(Surv((survtime + 0.9)/365.25, status) ~ state + T.categ + 
pspline(age, df=6), data = Aidsp))
zz <- predict(aids.ps, data.frame(state = factor(rep("NSW", 83), levels = 
levels(Aidsp$state)),
    T.categ = factor(rep("hs", 83), levels = levels(Aidsp$T.categ)), age = 
0:82), se = T, type = "linear")
plot(0:82, exp(zz$fit), type = "l", ylim = c(0, 2), xlab = "age", ylab = 
"expected lifetime (years)")

(aids.pscp <- coxph(Surv((survtime + 0.9)/365.25, status) ~ state + T.categ + 
pspline(age, df=6), data = Aidsp))
zzcp <- predict(aids.pscp, data.frame(state = factor(rep("NSW", 83), levels = 
levels(Aidsp$state)),
    T.categ = factor(rep("hs", 83), levels = levels(Aidsp$T.categ)), age = 
0:82), se = T, type = "lp")
lines(0:82, 1/exp(zzcp$fit),col=4)







More information about the R-help mailing list