[R] survreg & gompertz

Matthias Ziehm matthias.ziehm at gmail.com
Fri Nov 16 12:35:44 CET 2012


On 15/11/12 21:22, David Winsemius wrote:
> On Nov 15, 2012, at 5:38 AM, Matthias Ziehm wrote:
>
>> Hi all,
>>
>> Sorry if this has been answered already, but I couldn't find it in the archives or general internet.
>
> A Markmail/Rhelp search on:  gompertz survreg  ...brings this link to a reply by Terry Therneau. Seems to address everything you asked and a bit more
>
> http://markmail.org/search/?q=list%3Aorg.r-project.r-help+gompertz+survreg#query:list%3Aorg.r-project.r-help%20gompertz%20survreg+page:1+mid:6xdlsmo272oa7zkw+state:results
>
> (Depending on how your mailer breaks URLs you may need to paste it back together.)

Thanks! I hadn't read this thread because of the missleading title. 
However, now I've got follow up questions, on that explanation:

On Feb 5, 2010 at 8:48:08 am, Terry Therneau wrote:
>>> Subject: Re: [R] Using coxph with Gompertz-distributed survival data.
>>> ...
>>> the note below talks about how to do so approximately with survreg.
>>> It's a note to myself of something to add to the survival package
>>> documentation, not yet done, and to my embarassment the file has a
>>> time stamp in 1996. Ah well.
>>>
>>> Terry Therneau
>>>
>>> My document "A Package for Survival Analysis in S" contains statements
>>> about how to fit Gompertz and Rayleigh distributions with the survreg
>>> routine. Nicholas Brouard, in a recent query to this group, quite
>>> correctly states that "Therneau's documentation is a little elliptic for
>>> people not so familiar with extreme value theory".
>>>
>>> I've spent the last day trying to work out concrete examples of the
>>> fits. Let me start by saying that I now think my paper's remarks were
>>> overly optimistic. This note will try to indicate why. I will use some
>>> "TeX" notation below: \alpha, \beta, etc for Greek letters.
>>>
>>> Hazard functions:
>>>
>>> Weibull: p*(\lambda)^p * t^(p-1)
>>> Extreme value: (1/ \sigma) * exp( (t- \eta)/ \sigma)
>>> Rayleigh: a + bt
>>> Gompertz: b * c^t
>>> Makeham: a + b* c^t
>>>
>>> The Makeham hazard seems to fit human mortality experience beyond
>>> infancy quite well, where "a" is a constant mortality which is
>>> independent of the health of the subject (accidents, homicide, etc) and
>>> the second term models the Gompertz assumption that "the average
>>> exhaustion of a man's power to avoid death to is such that at theendof
>>> equal infinitely small itervals of time he lost equal portions of his
>>> remaining power to oppose destruction which he had at the commencement
>>> of these intervals". For older ages "a" is a neglible portion of the
>>> death rate and the Gompertz model holds.
>>>
>>> The fitting routine depends on the decomposition Y = \eta + \sigma W,
>>> where \eta = \beta_0 + \beta_1 * X_1 + \beta_2 * X_2 + ... is the fitted
>>> linear predictor and W is a distribution in "standard" form. For
>>> instance, if the response time t is Weibull, then y = log(t) follows
>>> this with \eta = log(\lambda) \sigma = 1/p
>>>
>>> Clearly
>>>
>>> 1. The Wiebull distribution with p=2 (sigma=.5) is the same as a
>>> Rayleigh distribution with a=0. It is not, however, the most general
>>> form of a Rayleigh.
>>>
>>> 2. The (least) extreme value and Gompertz distributions have the same
>>> hazard function, with \sigma = 1/ log(c), and exp(-\eta/ \sigma) = b.
If I do the math correctly from the above given extreme value hazard to 
Gompertz hazard. It needs to b = 1/ \sigma * exp(-\eta/ \sigma)
Other wise the 1/ \sigma of the Extreme value hazard is missing, isn't it?

>>>
>>> It would appear that the Gompertz can be fit with an identity link
>>> function combined with the extreme value distribution. However, this
>>> ignores a boundary restriction. If f(x; \eta, \sigma) is the extreme
>>> value distribution with paramters \eta and \sigma, then the definition
>>> of the Gompertz densitiy is g(x; \eta, \sigma) = 0 x< 0 g(x; \eta,
>>> \sigma) = c f(x; \eta, \sigma) x>=0
>>>
>>> where c= exp(exp(-\eta / \sigma)) is the necessary constant so that g
>>> integrates to 1.
Here I got really lost were the addition double exp suddenly comes from 
and how it fits in.
Given the above I would have thought that:
g(x,b,c) = f(x, \eta=-1/log(c)*log(b*1/log(c)), \sigma= 1/log(c) ) for x>=0

Can anyone clarify these thinga to me, please?

Matthias

>>> If \eta / \sigma is far from 1, then the correction
>>> term will be minimal and survreg should give a reasonable answer. If
>>> not, the distribution can't be fit, nor can it be made to easily conform
>>> to the general fitting scheme of the program.
>>>
>>> The Makeham distribution falls into the gamma family (equation 2.3 of
>>> Kalbfleisch and Prentice, Survival Analysis), but with the same range
>>> restriction problem.
>>>
>>> In summary, the Gompertz is a truncated form of the extreme value
>>> distribution (Johnson, Kotz and Blakrishnan, Contiuous Univariate
>>> Distri- butions, section 22.8). If one ignores the truncation, i.e.,
>>> assume that negative time values are possible, then it can be fit with
>>> survreg. My original note seems to have been compounded of 3 errors: the
>>> -1 arises from confusing the maximal extreme distribution (most common
>>> in theory books) with the minimal extreme distribution (used in
>>> survreg), the log() term was a typing mistake, and I never noticed the
>>> range restriction.
>>>
>>> This is one of the few topics in the report without a worked example as
>>> part of my test library (the Examples section of the package). The
>>> replacement document, currently in early draft, is intended to have a
>>> worked example for every claim and the code for that example in the
>>> appendix. This will, hopefully, cure any other mistakes of this sort.
>>>
>>
>> Is it possible to implement the gompertz distribution as survreg.distribution to use with survreg of the survival library?
>> I haven't found anything and recent attempts from my side weren't succefull so far.
>>
>> I know that other packages like 'eha' and 'flexsurv' offer functions similar to survreg with gompertz support. However, due to the run-time environment were this needs to be running in the end, I can't use these packages :(
>>
>> Same questions for the gompertz-makeham distribution.
>>
>> Many thanks!
>>
>> Matthias




More information about the R-help mailing list