[R] survival::survfit,plot.survfit

Tue Mar 3 15:22:00 CET 2009

---  begin included message ----
#Two models
coxsst4 <- coxph(Surv(schaeden)~ S5, data=nino4)
coxsst4_full  <- coxph(Surv(schaeden)~ 0+S1+S2+S3+S4+S5+S6+S7+S8+S9+S10, 
data=nino4)

#Set all covariates 0
attach(nino4)
newS4 <- data.frame(S0=0., S1=0., S2=0., S3=0., S4=0., S5=0., S6=0., 
S7=0., S8=0., S9=0., S10=0.)
detach()

new_surv1 <- survfit(coxsst4, newdata=newS4)
new_surv2 <- survfit(coxsst4_full, newdata=newS4)

Yields two different curves. What did I get wrong?

---- end inclusion ----------

  You did nothing wrong.  As I have said before, the survival curve from a Cox 
model is always for a particular hypothetical subject with a particular choice 
of covariates: there is nothing special (nil, nada, zip, NOTHING) about a 
covariate choice of zero.  There is no such thing as "the" baseline survival 
curve.
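In practice this means spelling out the subject you want a curve for.  A minimal 
sketch, using the lung data from the survival package as a stand-in (nino4 is 
not shown here, and the name "subjects" and the covariate values are made up for 
the illustration):

```r
library(survival)

# Illustration with the lung data (an assumption; nino4 is not available here)
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Two hypothetical subjects, described explicitly by their covariates
subjects <- data.frame(age = c(50, 70), sex = c(1, 2))
curves <- survfit(fit, newdata = subjects)

# survfit returns one curve (one column of curves$surv) per row of newdata
n_curves <- ncol(curves$surv)
print(n_curves)
```

Calling plot(curves, col = 1:2) would then draw both curves on one plot.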

  1. Imagine someone sabotaged your data set by replacing S1 with S1+6.  None of 
the Cox model coefficients or inferences would change, but "0" is now someone 
quite different than before.

  2. Consider the linear models
  	fit1 <- lm(pat.karno ~ age, data=lung)
  	fit2 <- lm(pat.karno ~ age + sex, data=lung)
They have different predicted values for the hypothetical subject with 
age=sex=0.  (A subject with age=0 sex=0 is not particularly interesting of 
course, but then coxph survival curves for all covariates=0 are about equally 
uninteresting.)
A baseline curve for all zeros is essentially an intercept term, and since it 
depends on which other covariates were or were not in the model, it is not 
useful on its own.
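
Both points can be checked numerically.  The sketch below is only an 
illustration: it uses the lung data from the survival package rather than nino4, 
and the names lung2, age6 and zero are invented for the example (age6 plays the 
role of the shifted S1+6).

```r
library(survival)

# Point 1: shift a covariate by 6 (age6 is a hypothetical stand-in for S1+6)
lung2 <- lung
lung2$age6 <- lung2$age + 6
fit_a <- coxph(Surv(time, status) ~ age,  data = lung2)
fit_b <- coxph(Surv(time, status) ~ age6, data = lung2)

# The coefficient is unchanged by the shift
same_coef <- isTRUE(all.equal(unname(coef(fit_a)), unname(coef(fit_b)),
                              tolerance = 1e-6))

# "Zero" now describes a different person, so the curves at 0 differ ...
s_a0 <- survfit(fit_a, newdata = data.frame(age = 0))
s_b0 <- survfit(fit_b, newdata = data.frame(age6 = 0))
max_gap_zero <- max(abs(s_a0$surv - s_b0$surv))

# ... while the same person (age 60, i.e. age6 = 66) gets the same curve
s_a60 <- survfit(fit_a, newdata = data.frame(age = 60))
s_b66 <- survfit(fit_b, newdata = data.frame(age6 = 66))
max_gap_same <- max(abs(s_a60$surv - s_b66$surv))

# Point 2: the lm predictions at age=sex=0 are just the two intercepts
fit1 <- lm(pat.karno ~ age, data = lung)
fit2 <- lm(pat.karno ~ age + sex, data = lung)
zero <- data.frame(age = 0, sex = 0)
p1 <- unname(predict(fit1, newdata = zero))
p2 <- unname(predict(fit2, newdata = zero))

print(c(max_gap_zero = max_gap_zero, max_gap_same = max_gap_same))
print(c(fit1 = p1, fit2 = p2))
```

The curves agree when the same person is described in both codings, and 
disagree only when "all zeros" is taken as if it meant the same thing in both 
models.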

	Terry Therneau