[R] Apparently Conflicting Results with coxph

Tue Oct 2 14:16:02 CEST 2007

Kevin E. Thorpe wrote:
> Peter Dalgaard wrote:
>> Kevin E. Thorpe wrote:
>>> Dear List:
>>>
>>> I have a data frame prepared in the couting process style for including
>>> a binary time-dependent covariate.  The first few rows look like this.
>>>
>>>     PtNo Start    End Status Imp
>>> 1      1     0  608.0      0   0
>>> 2      2     0  513.0      0   0
>>> 3      2   513  887.0      0   1
>>> 4      3     0   57.0      0   0
>>> 5      3    57  604.0      0   1
>>> 6      4     0  150.0      1   0
>>>
>>>
>>> The outcome is mortality and the covariate is for an implantable
>>> defibrillator, so it is expected that the implant would reduce the
>>> risk of death.  The results of fitting coxph (survival package) are:
>>>
>>> Call:
>>> coxph(formula = Surv(Start, End, Status) ~ Imp, data = nina.excl)
>>>
>>>
>>>      coef exp(coef) se(coef)     z    p
>>> Imp 0.163      1.18    0.485 0.337 0.74
>>>
>>> Likelihood ratio test=0.11  on 1 df, p=0.738  n= 335
>>>
>>> Since this was unexpected, I created a non-counting process data
>>> frame with an indicator variable representing received an implant
>>> or not.  Here are the results:
>>>
>>> Call:
>>> coxph(formula = Surv(Days, Dead) ~ Implant, data = nina.excl0)
>>>
>>>
>>>          coef exp(coef) se(coef)     z       p
>>> Implant -1.77     0.171    0.426 -4.15 3.3e-05
>>>
>>> Likelihood ratio test=19.1  on 1 df, p=1.21e-05  n= 197
>>>
>>> I found this degree of discrepancy surprising, especially the point
>>> estimate of the coefficient.  I have verified the data frames are
>>> set up correctly.
>>>
>>> Here is what I have tried to understand what is going on.
>>>
>>> I tried fitting models adjusted for other covariates that I have in
>>> the data frame.  This did not appreciably affect the coefficients
>>> for the implant variable.
>>>
>>> I ran cox.zph on the two models shown above and plotted the results.
>>> In both cases, the point estimate of Beta(t) is sort of parabolic
>>> in that the curves are monotonically increasing to a local maximum
>>> after which they are monotonically decreasing (the CIs are a bit
>>> more wiggly).
>>>
>>> I would interpret this to mean that the effect of implant is probably
>>> time-dependent.  If so, how do I actually get a "proper" estimate of
>>> beta(t) for a variable like this?
>>>
>>> Are there some other things I should look at to understand what's
>>> going on?
>>>
>>>   
>> If you want to play with time-dependent regression coefficients have a
>> look at the timereg package and the book that it supports.
>>
>> However, first you need to consider the possibility of selection effects
>> that can take place even with non-varying effects. In the case at hand I
>> would suspect a bias created by the fact that you don't implant devices
>> into people who are already dead.
>>
> 
> Thanks.  The point in your last paragraph did cross my mind too.
> 

I thought about this some more, and I'm not sure that possibility is
"to blame."  In my time-dependent model, I don't think I'm doing
anything different than is done for transplant in the Stanford
Heart Study (the often used example for this kind of time-dependent
covariate).  As in my case, you would not transplant a dead patient.
So, I remain puzzled as to why my model is misbehaving.

-- 
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Department of Public Health Sciences
Faculty of Medicine, University of Toronto
email: kevin.thorpe at utoronto.ca  Tel: 416.864.5776  Fax: 416.864.6057