[R] comparing SAS and R survival analysis with time-dependent covariates

Terry Therneau therneau at mayo.edu
Fri Dec 5 14:56:25 CET 2008


  The question "why do SAS and S give different answers for Cox models" comes
up every so often.  The two most common reasons are that
    a. the two runs use different options for handling ties
    b. the SAS and S data sets are slightly different.
Both apply in your case.

First, I make sure both packages see the same data set by reading a common
file, and then compare the results.

tmt54% more sdata.txt
 1   0.0  0.5     0       0
 1   0.5  3.0     1       1
 2   0.0  1.0     0       0
 2   1.0  1.5     1       1
 3   0.0  6.0     0       0
 4   0.0  8.0     0       1
 5   0.0  1.0     0       0
 5   1.0  8.0     1       0
 6   0.0 21.0     0       1
 7   0.0  3.0     0       0
 7   3.0 11.0     1       1

tmt55% more test.sas
options linesize=80;

data trythis;
    infile 'sdata.txt';
    input id start end delir outcome;
run;

proc phreg data=trythis;
  model (start, end)*outcome(0) = delir / ties=discrete;
run;

proc phreg data=trythis;
  model (start, end)*outcome(0) = delir / ties=efron;
run;


tmt56% more test.r
trythis <- read.table('sdata.txt',
                      col.names=c("id", "start", "end", "delir", "outcome"))

coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='exact')
coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='efron')
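
One way to rule out reason (b) before comparing coefficients is to print a few
summary counts in R and check them against the total, event, and censored
counts that PROC PHREG reports.  The sketch below assumes the eleven rows of
sdata.txt listed above, pasted inline so that it runs on its own; the names
n_rows, n_events, n_cens, bad_intervals, event_times and tied_times are
illustrative only.

```r
# Same rows as sdata.txt, pasted inline so the check is self-contained
trythis <- read.table(text = "
 1   0.0  0.5     0       0
 1   0.5  3.0     1       1
 2   0.0  1.0     0       0
 2   1.0  1.5     1       1
 3   0.0  6.0     0       0
 4   0.0  8.0     0       1
 5   0.0  1.0     0       0
 5   1.0  8.0     1       0
 6   0.0 21.0     0       1
 7   0.0  3.0     0       0
 7   3.0 11.0     1       1
", col.names = c("id", "start", "end", "delir", "outcome"))

n_rows   <- nrow(trythis)               # number of (start, end] intervals read
n_events <- sum(trythis$outcome == 1)   # intervals that end in an event
n_cens   <- sum(trythis$outcome == 0)   # intervals that end censored

# Every counting-process interval needs start < end
bad_intervals <- trythis[trythis$start >= trythis$end, ]

# Event times that occur more than once are where the tie option matters
event_times <- trythis$end[trythis$outcome == 1]
tied_times  <- unique(event_times[duplicated(event_times)])

print(c(rows = n_rows, events = n_events, censored = n_cens))
print(nrow(bad_intervals))
print(tied_times)
```

If any of these counts differ between the two packages, the data sets are not
the same and the fitted coefficients should not be expected to agree.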

-----------------
With a common data set and matching tie options, I now get comparable
answers.  Note that Cox's "exact partial likelihood" is the correct form to
use for discrete time data.  I labeled this the 'exact' method; SAS calls it
the 'discrete' method.  The "exact marginal likelihood" of Prentice et al,
which SAS calls the 'exact' method, is not implemented in S.
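
Because the same word means different things in the two packages, a small
lookup table can help when translating a call.  The sketch below records the
naming described above, together with the Breslow approximation, which both
packages also offer under the same name.  The data frame tie_options and the
helper sas_equivalent() are hypothetical names introduced for illustration,
not part of either package.  It is also worth remembering that the defaults
differ: coxph defaults to the Efron approximation, while PROC PHREG defaults
to Breslow.

```r
# Names each package uses for the same tie-handling likelihood.
# NA marks a method with no counterpart in coxph.
tie_options <- data.frame(
  likelihood = c("Breslow approximation",
                 "Efron approximation",
                 "exact partial likelihood (Cox)",
                 "exact marginal likelihood"),
  r_ties     = c("breslow", "efron", "exact", NA),
  sas_ties   = c("breslow", "efron", "discrete", "exact"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: translate a coxph tie option into the SAS TIES= value
sas_equivalent <- function(r_value) {
  tie_options$sas_ties[match(r_value, tie_options$r_ties)]
}

print(tie_options)
print(sas_equivalent("exact"))
```

The point of the table is that asking for 'exact' in both packages requests
two different likelihoods, so the tie options have to be matched by meaning
rather than by name.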
 
  As to which package is more reliable, I can only point to a set of formal test
cases that are found in Appendix E of the book by Therneau and Grambsch.  These
are small data sets where the coefficients, log-likelihood, residuals, etc. have
all been worked out exactly in closed form.  R gets all of these test cases
right; SAS gets almost all of them.
 
 	Terry Therneau
 	
-----------------------------------------
Svetlan Eden wrote
Dear R-help,

I was comparing SAS (I do not know what version it is) and R (version 
2.6.0 (2007-10-03) on Linux) survival analyses with time-dependent 
covariates. The results differed significantly, so I tried to understand
from a short example where I went wrong. The following example shows that
even when argument 'method' in R function coxph and argument 'ties' in
SAS procedure phreg are the same, the results of the Cox regression are
different. This seems to happen when there are ties in the
event/covariate times.

My question is which software, R or SAS, is more reliable for
survival analysis with time-dependent covariates, or whether you could point
out a problem in the following example.

 ...


