[R] Using coxph with Gompertz-distributed survival data.

Göran Broström goran.brostrom at gmail.com
Fri Feb 5 19:52:39 CET 2010

On Fri, Feb 5, 2010 at 5:48 PM, Terry Therneau <therneau at mayo.edu> wrote:
> Before being helpful let me raise a couple of questions:
>  1. "I know I'm looking at longevity data (which is believed to have a
> Gompertz distribution for mammals dying from 'old age')".
>   I'm not as convinced.  The Gompertz is a nice story, but is
> confounded by individual risk or 'frailty'.  But continue
>  2. "the mortality rate will be higher for 'a' younger ages, higher for
> 'b' at older ages, and the assumption of the Cox Proportional Hazards
> model is violated a priori, isn't it?"
>  That is correct.  So why exactly are you using coxph to fit the data.
>  3. "Yet I found plenty of Gompertz parameter values that differ, and
> lead to differences in survival times detectable by coxph, yet pass the
> cox.zph test. Should I assume that cox.zph is insufficiently
> sensitive..."
>  The Cox model fit a model with an average hazard ratio over time.  If
> the data satisfies the proportional hazards model, then this is all you
> need -- this single number tells you everything.  If the data does not,
> this does not mean that such an average hazard is invalid, it tells you
> that this average is not the whole story and coxph is an
> oversimplification.  I view this as similar to the fact that if a
> distribution is Gaussian then then (mean, var) is sufficient, everything
> that you ever wanted to know about the data (percentiles, outliers, ...)
> is summed up in those two values.  If it's not Gaussian it does not
> follow that the mean is worthless, but it isn't a complete story.
>  If you pick your parameters so that the change in hazard ratio is "not
> very large", of course cox.zph will not see it.  That's also the case
> where an overall average is probably a pretty good summary.
> 4: "coxph(Surv(age) ~ group + group:age)"
>  This is not how a change in hazard ratio over time is approached.  The
> program should give an error.  For one, why do you assume the change is
> linear in time?  This is rather rare.  You might look at the timedep
> package.
> 5. Some actual advice -- if you think it is Gompertzian why not fit a
> Gompertz distribution?
>  I don't see anything in CRAN to directly fit Gompertz,

See the functions 'aftreg' and 'phreg' in the package 'eha'.

Göran Broström

>  but the note
> below talks about how to do so approximately with survreg.  It's a note
> to myself of something to add to the survival package documentation, not
> yet done, and to my embarassment the file has a time stamp in 1996.  Ah
> well.
>                Terry Therneau
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Göran Broström

More information about the R-help mailing list