[R] center option of basehaz in survfit

Thomas Lumley tlumley at u.washington.edu
Mon Oct 1 17:30:33 CEST 2007


On Thu, 27 Sep 2007, David Koons wrote:

> I have a very general question about what the centering option in 
> basehaz does to factors.  (basehaz computes the baseline cumulative 
> hazard for a coxph object using the Breslow estimator).
>
> Let's say I'm interested in a survival model with two (dichotomous) factors and a continuous covariate.
> Variable               Possible Values
> Factor1                0 or 1
> Factor2                0 or 1
> Covariate            0 to 100
>
> I fit my model:
> modelname <- coxph(Surv ~ Factor1 + Factor2 + Covariate, data = data)
>
> If I then ask for:
> baselineA <- basehaz(modelname, centered=FALSE)
> I am fairly certain that baselineA will provide me with the cumulative 
> hazard evaluated at Factor1 = 0, Factor2 = 0, Covariate = 0.

Indeed

> Yet, if I ask for:
> baselineB <- basehaz(modelname, centered=TRUE)
> I know that baselineB will evaluate the cumulative hazard at Covariate = 
> 50

Only if 50 is the mean.

>, but am uncertain as to what it does with the factors.  I would not 
> think that the function would attempt to average a "factor"; however, I 
> cannot find any documentation to support my assumption.  To make sure, 
> does anyone know how basehaz (centered = TRUE/FALSE) handles models that 
> include both categorical factors and continuous covariates?

Yes, someone does (perhaps many people).

It averages the columns of the design matrix. This is not quite 'averaging 
factors', as the result depends on the choice of contrasts as well as on 
the coding of the factor. In your example it isn't clear whether 
'Factor1' and 'Factor2' are defined as factors, but with the default 
coding and default contrasts the answer is the same either way.
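
To see what 'averaging the columns of the design matrix' amounts to, here 
is a minimal sketch in base R. The data frame 'd' and its values are made 
up purely for illustration (they are not from the original question), and 
the code only inspects the design matrix; it does not call basehaz(), 
whose centering details may differ between versions of the survival 
package.

```r
# Hypothetical toy data, invented for illustration only.
d <- data.frame(
  Factor1   = factor(c(0, 1, 1, 0, 1, 1)),
  Factor2   = factor(c(0, 0, 1, 1, 1, 0)),
  Covariate = c(10, 40, 50, 60, 90, 50)
)

# Design matrix with the default treatment contrasts.
# A Cox model has no intercept, so drop that column.
X_fac <- model.matrix(~ Factor1 + Factor2 + Covariate, data = d)[, -1]

# The same variables stored as numeric 0/1 instead of factors.
d_num <- transform(
  d,
  Factor1 = as.numeric(as.character(Factor1)),
  Factor2 = as.numeric(as.character(Factor2))
)
X_num <- model.matrix(~ Factor1 + Factor2 + Covariate, data = d_num)[, -1]

# Sum-to-zero contrasts give different dummy columns, hence different means.
X_sum <- model.matrix(
  ~ Factor1 + Factor2 + Covariate,
  data = d,
  contrasts.arg = list(Factor1 = "contr.sum", Factor2 = "contr.sum")
)[, -1]

colMeans(X_fac)  # dummy-column means are the proportions at level "1"
colMeans(X_num)  # same values as above, only the column names differ
colMeans(X_sum)  # factor columns now average +1/-1 codes
```

With treatment contrasts the 'mean' of a two-level factor is just the 
proportion of observations in the non-reference level, which is why the 
factor and numeric 0/1 codings agree; with other contrasts the centering 
point moves.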

The mean is the default centering for survfit.coxph(), partly because it 
is the centering used internally to fit the Cox model.  The main point of 
basehaz() is to provide centered=FALSE for people who want it (or think 
they do).  You can get survival curves for any covariate values you like 
from survfit.coxph().
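
A sketch of that last point, assuming the survival package is installed 
and using its bundled 'lung' data as a stand-in for the poster's data. The 
model and the covariate values in 'nd' are arbitrary choices for 
illustration.

```r
library(survival)

# Stand-in model: one two-level factor and one continuous covariate.
fit <- coxph(Surv(time, status) ~ factor(sex) + age, data = lung)

# Covariate values at which the curve is wanted (arbitrary example values).
nd <- data.frame(sex = 1, age = 60)

# survfit() on a coxph object dispatches to survfit.coxph();
# newdata selects the covariate values instead of any centering default.
sf <- survfit(fit, newdata = nd)

# Estimated cumulative hazard and survival at those covariate values.
head(data.frame(time = sf$time, cumhaz = sf$cumhaz, surv = sf$surv))
```

Specifying 'newdata' explicitly avoids having to reason about what the 
centered or uncentered baseline corresponds to for factor columns.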

 	-thomas


