[R] comparing SAS and R survival analysis with time-dependent covariates

AO_Statistics aboueslati at gmail.com
Tue Jul 19 19:42:02 CEST 2011


Terry Therneau-2 wrote:
> 
> This query of "why do SAS and S give different answers for Cox models"
> comes 
> up every so often.  The two most common reasons are that
>   	a. they are using different options for the ties
>   	b. the SAS and S data sets are slightly different.
> You have both errors.
> 
> First, make sure I have the same data set by reading a common file, and
> then
> compare the results.
> 
> tmt54% more sdata.txt
>  1   0.0  0.5     0       0
>  1   0.5  3.0     1       1
>  2   0.0  1.0     0       0
>  2   1.0  1.5     1       1
>  3   0.0  6.0     0       0
>  4   0.0  8.0     0       1
>  5   0.0  1.0     0       0
>  5   1.0  8.0     1       0
>  6   0.0 21.0     0       1
>  7   0.0  3.0     0       0
>  7   3.0 11.0     1       1
> 
> tmt55% more test.sas
> options linesize=80;
> 
> data trythis;
>     infile 'sdata.txt';
>     input id start end delir outcome;
> 
> proc phreg data=trythis;
>   model (start, end)*outcome(0)=delir/ ties=discrete;
> 
> proc phreg data=trythis;
>   model (start, end)*outcome(0)=delir/ ties=efron;
> 
> 
> tmt56% more test.r
> trythis <- read.table('sdata.txt',
>                       col.names=c("id", "start", "end", "delir",
> "outcome"))
> 
> coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='exact')
> coxph(Surv(start, end, outcome) ~ delir, data=trythis, ties='efron')
> 
> -----------------
>  I now get comparable answers.  Note that Cox's "exact partial likelihood"
> is 
> the correct form to use for discrete time data.  I labeled this as the
> 'exact' 
> method and SAS as the 'discrete' method.  The "exact marginal likelihood"
> of 
> Prentice et al, which SAS calls the 'exact' method is not implemented in
> S.
>  
>   As to which package is more reliable, I can only point to a set of
> formal test 
> cases that are found in Appendix E of the book by Therneau and Grambsch.  
> 
> [...]
> 
> 


I am processing estimations of regression parameters in the Cox model for
recurrent event data with time-dependent covariates. As my data sets contain
a lot of ties, I use the "discrete" method of SAS ("PHREG" procedure) and
"exact" option in R ("coxph" function of "survival" package).

Despite the high computation time (up to 45s), I always get estimations
without error or warning message with the "PHREG" procedure.
On the other hand, when I use R software (latest version 2.13.11 on 32 or 64
bits), I sometimes get different estimates from those obtained with SAS and
I get various warnings. And some other time I don't get any result, R
freezes and does not respond.

In order to understand, I have tried some tests from your examples. It turns
out that dysfunctions appear when the proportion of ties become important :

-------------------- Test1 --------------------

With R :

> (trythis <- read.table('***\\sdata4.txt',
+                       col.names=c("id", "start", "end", "delir",
"outcome")))
   id start end delir outcome
1   1   0.5 3.0     1       1
2   1   0.5 3.0     1       1
3   1   0.5 3.0     1       1
4   1   0.5 3.0     1       1
5   1   0.5 3.0     1       1
6   2   1.0 1.5     1       1
7   2   1.0 1.5     1       1
8   2   1.0 1.5     1       1
9   2   1.0 1.5     1       1
10  2   1.0 1.5     1       1
11  2   1.0 1.5     1       1
12  2   1.0 1.5     1       1
13  2   1.0 1.5     1       1
14  4   0.0 8.0     0       1
15  4   0.0 8.0     0       1
16  4   0.0 8.0     0       1
17  4   0.0 8.0     0       1
18  5   0.0 1.0     0       0
19  5   0.0 1.0     0       0
20  5   0.0 1.0     0       0
21  5   0.0 1.0     0       0
> coxph(Surv(start, end, outcome) ~ delir, data=trythis, method='exact')
Call:
coxph(formula = Surv(start, end, outcome) ~ delir, data = trythis, 
    method = "exact")


      coef exp(coef) se(coef)       z p
delir 22.5  6.06e+09    15460 0.00146 1

Likelihood ratio test=15.6  on 1 df, p=8.04e-05  n= 21, number of events= 17 
Message d'avis :
In fitter(X, Y, strats, offset, init, control, weights = weights,  :
  Ran out of iterations and did not converge

With SAS :

data trythis ;
input id start end delir outcome;
datalines;
 1   0.5  3.0     1       1
 1   0.5  3.0     1       1
 1   0.5  3.0     1       1
 1   0.5  3.0     1       1
 1   0.5  3.0     1       1
 2   1.0  1.5     1       1
 2   1.0  1.5     1       1
 2   1.0  1.5     1       1
 2   1.0  1.5     1       1
 2   1.0  1.5     1       1
 2   1.0  1.5     1       1
 2   1.0  1.5     1       1
 2   1.0  1.5     1       1
 4   0.0  8.0     0       1
 4   0.0  8.0     0       1
 4   0.0  8.0     0       1
 4   0.0  8.0     0       1
 5   0.0  1.0     0       0
 5   0.0  1.0     0       0
 5   0.0  1.0     0       0
 5   0.0  1.0     0       0
run;
proc phreg data=trythis;
  model (start, end)*outcome(0)=delir/ ties=discrete;
RUN;

No error message, results :

estimate delir : 20.52466
se : 5689
Pr > Khi 2 : 0.9971
convergence status : "Convergence criterion (GCONV=1E-8) satisfied."

-------------------- Test2 --------------------

With R :

> (trythis <- read.table('***\\montest.txt',
+                       col.names=c("id", "start", "end", "delir",
"outcome")))
   id start end delir outcome
1   1     1   2     1       1
2   1     1   2     0       1
3   1     1   2     0       1
4   1     1   2     1       1
5   1     1   2     0       1
6   1     1   2     1       1
7   2     1   2     1       1
8   2     1   2     1       0
9   3     8  10     0       0
10  4     8  10     0       0
> coxph(Surv(start, end, outcome) ~ delir, data=trythis, method='exact')
Call:
coxph(formula = Surv(start, end, outcome) ~ delir, data = trythis, 
    method = "exact")


       coef exp(coef) se(coef)         z p
delir -20.8  9.42e-10    42054 -0.000494 1

Likelihood ratio test=0.94  on 1 df, p=0.332  n= 10, number of events= 7 
Message d'avis :
In fitter(X, Y, strats, offset, init, control, weights = weights,  :
  Loglik converged before variable  1 ; beta may be infinite. 

With SAS :

data trythis ;
input id start end delir outcome;
datalines;
 1   1.0  2.0     1       1
 1   1.0  2.0     0       1
 1   1.0  2.0     0       1
 1   1.0  2.0     1       1
 1   1.0  2.0     0       1
 1   1.0  2.0     1       1
 2   1.0  2.0     1       1
 2   1.0  2.0     1       0
 3   8.0  10.0    0       0
 4   8.0  10.0    0       0
run;
proc phreg data=trythis;
  model (start, end)*outcome(0)=delir/ ties=discrete;
RUN;

results :

estimate delir : -17.78257
se : 9383
Pr > Khi 2 : 0.9985
convergence status : "Convergence criterion (GCONV=1E-8) satisfied."



So I get 2 different warning messages and with my data sets it can be worse.
How would you explain these dysfunctions or warnings with "coxph" function
and how should I deal with it ? Do you think that the results obtained by
SAS are reliable in such cases ?

Thank you for your answer.



--
View this message in context: http://r.789695.n4.nabble.com/comparing-SAS-and-R-survival-analysis-with-time-dependent-covariates-tp874438p3678763.html
Sent from the R help mailing list archive at Nabble.com.



More information about the R-help mailing list