[R] Why is transform="km" the default for cox.zph?

Kevin E. Thorpe kevin.thorpe at utoronto.ca
Mon Apr 17 15:37:52 CEST 2006


At the suggestion of Thomas Lumley, I posted this to s-news.
Dr. Therneau replied and I have posted (with permission) his
answer below the original question.

Kevin E. Thorpe wrote:
> To enhance my understanding, and that of my students, I have a question
> about cox.zph in the survival package.
> 
> If I have correctly gleaned the high-level point from the 1994
> Biometrika paper of Grambsch and Therneau, it looks to me like
> cox.zph provides a mechanism to test for a simple trend in plots
> of the scaled Schoenfeld residuals against a function of time, g(t).
> It also provides some built-in choices of g(t) and the capability
> to provide your own.  It also appears to me that different forms of
> g(t) look at different departures from proportionality.
> 
> So, my question is what are the advantages and disadvantages of the
> default transform="km" compared to say, identity or log?
> 
> Thank you.
> 
> Kevin
> 

=== Begin Dr. Therneau's Reply ===

There are 2 reasons for making the KM the default:

  1. Safety:  The test for PH is essentially a least-squares fit of
     a line to a plot of f(time) vs residual.  If the plot contains an
     extreme outlier in x, then the test is basically worthless.  This
     sometimes happens with transform="identity" or transform="log".
     It doesn't with transform="km".

     As a default value for naive users, I chose the safe course.

  2. A secondary reason is efficiency.  In JASA (1991), D. Y. Lin
     argues that this is a "good" test statistic under various
     assumptions about censoring.  (His measure has the same score
     statistic as the KM option.)

But #1 is the big one.

     Terry T.

=== End Dr. Therneau's Reply ===
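
The "safety" argument in point 1 can be checked numerically.  What follows
is only a rough sketch in plain Python, not the code used by cox.zph.  The
data are made up, the helper names (km_transform, max_leverage) are
hypothetical, and the KM transform is assumed here to be 1 - S(t) evaluated
at the event times, with no tied times.  The sketch computes the largest
leverage (hat value) that any single event time has in a straight-line
least-squares fit, under each choice of x-axis.

```python
import math

def km_transform(times, events):
    """Return 1 - S(t) at each event time, where S is the Kaplan-Meier
    estimate. Assumes no tied times, which holds for the toy data below."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    surv = 1.0
    transformed = []
    for i in order:
        if events[i]:                  # an event: the KM curve steps down
            surv *= 1.0 - 1.0 / at_risk
            transformed.append(1.0 - surv)
        at_risk -= 1                   # events and censorings both leave the risk set
    return transformed

def max_leverage(x):
    """Largest hat-matrix diagonal for a straight-line least-squares fit on x."""
    n = len(x)
    mean = sum(x) / n
    sxx = sum((v - mean) ** 2 for v in x)
    return max(1.0 / n + (v - mean) ** 2 / sxx for v in x)

# Hypothetical data: follow-up times 1..19 plus one very late event at 500.
times = [float(t) for t in range(1, 20)] + [500.0]
events = [1] * 20
events[4] = 0     # the subject with time 5 is censored
events[11] = 0    # the subject with time 12 is censored

# Only event times contribute points to the residual plot.
event_times = [t for t, e in zip(times, events) if e]

lev_identity = max_leverage(event_times)
lev_log = max_leverage([math.log(t) for t in event_times])
lev_km = max_leverage(km_transform(times, events))

print("max leverage, identity:", round(lev_identity, 3))
print("max leverage, log:     ", round(lev_log, 3))
print("max leverage, km:      ", round(lev_km, 3))
```

A leverage close to 1 means the fitted slope is determined almost entirely
by that one point.  In this toy data set the late event at 500 dominates on
the identity scale and remains influential on the log scale, while the
KM-transformed x values are confined to [0, 1] and no single point stands
out.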


-- 
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Department of Public Health Sciences
Faculty of Medicine, University of Toronto
email: kevin.thorpe at utoronto.ca  Tel: 416.946.8081  Fax: 416.946.3297



