[R] comparing SAS and R survival analysis with time-dependent covariates
Terry Therneau
therneau at mayo.edu
Fri Dec 5 14:56:25 CET 2008
This query of "why do SAS and S give different answers for Cox models" comes
up every so often. The two most common reasons are that
a. they are using different options for the ties
b. the SAS and S data sets are slightly different.
You have both errors.
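Reason (a) only matters when two or more events share the same time, so a quick check for tied event times is a useful first step. The following is a minimal sketch, not part of the original scripts: the helper name tied_event_times and the toy vectors are hypothetical and are not taken from the poster's data.

```r
# Hypothetical helper: return the event times shared by more than one event.
tied_event_times <- function(time, status) {
  counts <- table(time[status == 1])      # count events at each distinct time
  as.numeric(names(counts)[counts > 1])   # keep times with two or more events
}

# Toy data (an assumption for illustration): two events at time 2,
# one event at time 5, and censored observations at times 2 and 7.
toy_time   <- c(2, 2, 2, 5, 7)
toy_status <- c(1, 1, 0, 1, 0)
tied_event_times(toy_time, toy_status)
```

Applied to a real data set, the same call would take the end-of-interval column and the event indicator.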
First, I make sure that both packages see the same data set by having them
read a common file, and then compare the results.
tmt54% more sdata.txt
1 0.0 0.5 0 0
1 0.5 3.0 1 1
2 0.0 1.0 0 0
2 1.0 1.5 1 1
3 0.0 6.0 0 0
4 0.0 8.0 0 1
5 0.0 1.0 0 0
5 1.0 8.0 1 0
6 0.0 21.0 0 1
7 0.0 3.0 0 0
7 3.0 11.0 1 1
tmt55% more test.sas
options linesize=80;
data trythis;
infile 'sdata.txt';
input id start end delir outcome;
proc phreg data=trythis;
model (start, end)*outcome(0)=delir/ ties=discrete;
proc phreg data=trythis;
model (start, end)*outcome(0)=delir/ ties=efron;
tmt56% more test.r
trythis <- read.table('sdata.txt',
col.names=c("id", "start", "end", "delir", "outcome"))
coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='exact')
coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='efron')
-----------------
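For readers without sdata.txt at hand, the R half of the comparison can be sketched in self-contained form. This is only a sketch under stated assumptions: the 11 rows are copied from the listing above and embedded as text, the object names sdata and comparison are introduced here for illustration, and the 'ties' argument is used as in test.r (the original question refers to the same option as 'method').

```r
library(survival)

# The same 11 rows as sdata.txt, embedded so no external file is needed.
sdata <- "
1 0.0 0.5 0 0
1 0.5 3.0 1 1
2 0.0 1.0 0 0
2 1.0 1.5 1 1
3 0.0 6.0 0 0
4 0.0 8.0 0 1
5 0.0 1.0 0 0
5 1.0 8.0 1 0
6 0.0 21.0 0 1
7 0.0 3.0 0 0
7 3.0 11.0 1 1
"
trythis <- read.table(text = sdata,
                      col.names = c("id", "start", "end", "delir", "outcome"))

# Fit the counting-process Cox model under both tie-handling options.
fit_exact <- coxph(Surv(start, end, outcome) ~ delir, data = trythis,
                   ties = "exact")
fit_efron <- coxph(Surv(start, end, outcome) ~ delir, data = trythis,
                   ties = "efron")

# Collect coefficient and final log partial likelihood side by side,
# ready to be set against the PROC PHREG output.
comparison <- data.frame(
  ties   = c("exact", "efron"),
  coef   = unname(c(coef(fit_exact), coef(fit_efron))),
  loglik = c(fit_exact$loglik[2], fit_efron$loglik[2])
)
print(comparison)
```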
I now get comparable answers. Note that Cox's "exact partial likelihood" is
the correct form to use for discrete time data. I labeled this the 'exact'
method; SAS calls it the 'discrete' method. The "exact marginal likelihood" of
Prentice et al, which SAS calls the 'exact' method, is not implemented in S.
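Because the two packages use the word 'exact' for different likelihoods, it can help to keep the correspondence in a small lookup table. The sketch below only restates the naming described above; the object name ties_map is an assumption made for illustration.

```r
# Which option name selects which likelihood in each package.
# NA marks a method that is not implemented in S, per the post.
ties_map <- data.frame(
  likelihood = c("Efron approximation",
                 "Cox exact partial likelihood",
                 "exact marginal likelihood"),
  r_coxph    = c("efron", "exact", NA),
  sas_phreg  = c("efron", "discrete", "exact"),
  stringsAsFactors = FALSE
)
print(ties_map)

# Look up the SAS name that matches R's 'exact'.
ties_map$sas_phreg[ties_map$r_coxph %in% "exact"]
```

The point of the table is that matching the option names alone is not enough: 'exact' in one package has to be paired with 'discrete' in the other.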
As to which package is more reliable, I can only point to a set of formal test
cases that are found in Appendix E of the book by Therneau and Grambsch. These
are small data sets where the coefficients, log-likelihood, residuals, etc. have
all been worked out exactly in closed form. R gets all of these test cases
right; SAS gets almost all.
Terry Therneau
-----------------------------------------
Svetlan Eden wrote
Dear R-help,
I was comparing SAS (I do not know what version it is) and R (version
2.6.0 (2007-10-03) on Linux) survival analyses with time-dependent
covariates. The results differed significantly so I tried to understand
on a short example where I went wrong. The following example shows that
even when argument 'method' in R function coxph and argument 'ties' in
SAS procedure phreg are the same, the results of Cox regr. are
different. This seems to happen when there are ties in the
events/covariates times.
My question is what software, R or SAS, is more reliable for the
survival analysis with time-dependent covariates or if you could point
out a problem in the following example.
...