[R] Comparing COXPH models, one with age as a continuous variable, one with age as a three-level factor
Frank Harrell
f.harrell at vanderbilt.edu
Thu Sep 2 22:23:07 CEST 2010
On Thu, 2 Sep 2010, stephenb wrote:
>
> sorry to bump in late, but I am doing similar things now and was browsing.
>
> IMHO anova is not appropriate here. it applies when the richer model has p
> more variables than the simpler model. this is not the case here. the
> competing models use different variables.
A simple approach is to have the factor variable in the model and to
formally test for added information given by the continuous variable
(linear, quadratic, spline, etc). AIC could also be used.
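A rough sketch of that approach follows. It is not from the original thread: it assumes the lung data shipped with the survival package as a stand-in for the real data, the cut points 55 and 65 are arbitrary, and a natural spline from the splines package is used instead of rms's rcs.

```r
library(survival)
library(splines)

# Stand-in data (assumption): survival's lung data, status coded 1 = censored, 2 = dead
d <- survival::lung

# Hypothetical three-level age factor; the cut points are for illustration only
d$agecat <- cut(d$age, breaks = c(-Inf, 55, 65, Inf),
                labels = c("<=55", "56-65", ">65"))

# Model with the factor only
f_cat <- coxph(Surv(time, status) ~ agecat + sex, data = d)

# Same model plus continuous age (natural spline), so f_cat is nested in f_both
f_both <- coxph(Surv(time, status) ~ agecat + ns(age, df = 3) + sex, data = d)

# Likelihood ratio test for the information added by continuous age
lrt <- anova(f_cat, f_both)
print(lrt)

# AIC comparison of the two non-nested competitors
f_cont <- coxph(Surv(time, status) ~ ns(age, df = 3) + sex, data = d)
print(AIC(f_cat, f_cont))
```

Because f_cat is nested within f_both, the likelihood ratio test is valid here, whereas the factor-only and spline-only models are not nested and are compared by AIC instead.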
>
> you are left with IC.
>
> by transforming a continuous variable into categorical you are smoothing,
> which is the idea of GAM. if you look at what is offered in GAMs you may
> find better approximations f(age) as well as tools for testing among
> different f(age) transformations.
I don't follow that comment. Smoothing uses the full continuous
variable to begin with.
A restricted cubic spline function in age is a simple approach. E.g.:
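require(rms)
# datadist stores variable summaries that Predict() uses for default ranges
dd <- datadist(mydata); options(datadist='dd')
# Cox model with a 4-knot restricted cubic spline in age
f <- cph(Surv(dtime,death) ~ rcs(age,4) + sex, data=mydata)
# Plot the estimated age effect with confidence bands
plot(Predict(f, age))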
Note that you can always expect the categorized version of age not to
fit the data, except sometimes when behavior is dictated by law
(driving, drinking, military service, Medicare).
Frank
>
> regards.
> S.