[R] Apparently Conflicting Results with coxph

Tue Oct 2 18:33:40 CEST 2007

On Tue, 2 Oct 2007, Kevin E. Thorpe wrote:

> Kevin E. Thorpe wrote:
>> Peter Dalgaard wrote:
>>> Kevin E. Thorpe wrote:
>>>> Dear List:
>>>>
>>>> I have a data frame prepared in the couting process style for including
>>>> a binary time-dependent covariate.  The first few rows look like this.
>>>>
>>>>     PtNo Start    End Status Imp
>>>> 1      1     0  608.0      0   0
>>>> 2      2     0  513.0      0   0
>>>> 3      2   513  887.0      0   1
>>>> 4      3     0   57.0      0   0
>>>> 5      3    57  604.0      0   1
>>>> 6      4     0  150.0      1   0
>>>>
>>>>
>>>> The outcome is mortality and the covariate is for an implantable
>>>> defibrillator, so it is expected that the implant would reduce the
>>>> risk of death.  The results of fitting coxph (survival package) are:
>>>>
>>>> Call:
>>>> coxph(formula = Surv(Start, End, Status) ~ Imp, data = nina.excl)
>>>>
>>>>
>>>>      coef exp(coef) se(coef)     z    p
>>>> Imp 0.163      1.18    0.485 0.337 0.74
>>>>
>>>> Likelihood ratio test=0.11  on 1 df, p=0.738  n= 335
>>>>
>>>> Since this was unexpected, I created a non-counting process data
>>>> frame with an indicator variable representing received an implant
>>>> or not.  Here are the results:
>>>>
>>>> Call:
>>>> coxph(formula = Surv(Days, Dead) ~ Implant, data = nina.excl0)
>>>>
>>>>
>>>>          coef exp(coef) se(coef)     z       p
>>>> Implant -1.77     0.171    0.426 -4.15 3.3e-05
>>>>
>>>> Likelihood ratio test=19.1  on 1 df, p=1.21e-05  n= 197
>>>>
>>>> I found this degree of discrepancy surprising, especially the point
>>>> estimate of the coefficient.  I have verified the data frames are
>>>> set up correctly.
>>>>
>>>> Here is what I have tried to understand what is going on.
>>>>
>>>> I tried fitting models adjusted for other covariates that I have in
>>>> the data frame.  This did not appreciably affect the coefficients
>>>> for the implant variable.
>>>>
>>>> I ran cox.zph on the two models shown above and plotted the results.
>>>> In both cases, the point estimate of Beta(t) is sort of parabolic
>>>> in that the curves are monotonically increasing to a local maximum
>>>> after which they are monotonically decreasing (the CIs are a bit
>>>> more wiggly).
>>>>
>>>> I would interpret this to mean that the effect of implant is probably
>>>> time-dependent.  If so, how do I actually get a "proper" estimate of
>>>> beta(t) for a variable like this?
>>>>
>>>> Are there some other things I should look at to understand what's
>>>> going on?
>>>>
>>>>
>>> If you want to play with time-dependent regression coefficients have a
>>> look at the timereg package and the book that it supports.
>>>
>>> However, first you need to consider the possibility of selection effects
>>> that can take place even with non-varying effects. In the case at hand I
>>> would suspect a bias created by the fact that you don't implant devices
>>> into people who are already dead.
>>>
>>
>> Thanks.  The point in your last paragraph did cross my mind too.
>>
>
> I thought about this some more, and I'm not sure that possibility is
> "to blame."  In my time-dependent model, I don't think I'm doing
> anything different than is done for transplant in the Stanford
> Heart Study (the often used example for this kind of time-dependent
> covariate).  As in my case, you would not transplant a dead patient.
> So, I remain puzzled as to why my model is misbehaving.
>

Indeed, the "bias created" that Peter alludes to is precisely because you 
would not implant a device in a dead patient that the implantation of such 
a device has an apparent negative effect on hazard of mortality.

If a patient is enrolled and a date is set (perhaps implicitly) for 
implanting the device, then a patient who survives longer is more likely 
to receive the device.

Here is a setup in which patients have exponential hazard and are 
scheduled to receive devices uniformly over a fixed time interval 
following their enrollment. As you see the time of intended implant is 
independent of the time of death by construction, but the apparent effect 
of implantation is negative.

> time.of.death <- rexp(1000)
> time.of.implant <- runif(1000,0,2)
> implant.occurred <- time.of.death > time.of.implant
> coxph( Surv( time.of.death ) ~ implant.occurred )
Call: coxph(formula = Surv(time.of.death) ~ implant.occurred)

                       coef exp(coef) se(coef)     z p
implant.occurredTRUE -1.80     0.165    0.084 -21.5 0

Likelihood ratio test=519  on 1 df, p=0  n= 1000

HTH,

Chuck

>
> -- 
> Kevin E. Thorpe
> Biostatistician/Trialist, Knowledge Translation Program
> Assistant Professor, Department of Public Health Sciences
> Faculty of Medicine, University of Toronto
> email: kevin.thorpe at utoronto.ca  Tel: 416.864.5776  Fax: 416.864.6057
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901