[R] coxph and ordinal variables?

Paul Johnson pauljohn32 at gmail.com
Thu Sep 9 04:00:08 CEST 2010


run it with factor() instead of ordered().  You don't want the
"orthogonal polynomial" contrasts that result from ordered if you need
to compare against Stata.

I attach an R program that I wrote to explore ordered factors a while
agol I believe this will clear everything up if you study the
examples.

pj

On Wed, Sep 8, 2010 at 5:43 PM, Min-Han Tan <minhan.science at gmail.com> wrote:
> Dear R-help members,
>
> Apologies - I am posting on behalf of a colleague, who is a little puzzled
> as STATA and R seem to be yielding different survival estimates for the same
> dataset when treating a variable as ordinal. Ordered() is used  to represent
> an ordinal variable) I understand that R's coxph (by default) uses the Efron
> approximation, whereas STATA uses (by default) the Breslow. but we did
> compare using the same approximations. I am wondering if this is a result of
> how coxph manages an ordered factor?
>
> Essentially, this is a survival dataset using tumor grade (1, 2, 3 and 4) as
> the risk factor. This is more of an 'ordinal' variable, rather than a
> continuous variable. For the same data set of 399 patients, when treating
> the vector of tumor grade as a continuous variable (range of 1 to 4),
> testing the Efron and the Breslow approximations yield the same result in
> both R and STATA.
>
> However, when Hist_Grade_4 grp is converted into an ordered factor using
> ordered(), and the same scripts are applied, rather different results are
> obtained, relative to the STATA output. This is tested across the different
> approximations, with consistent results. The comparison using Efron
> approximation and ordinal data is is below.
>
> Your advice is very much appreciated!
>
> Min-Han
>
> Apologies below for the slightly malaligned output.
>
> STATA output
>
> . xi:stcox i.Hist_Grade_4grp, efr
> i.Hist_Grade_~p   _IHist_Grad_1-4     (naturally coded; _IHist_Grad_1
> omitted)
>
>        failure _d:  FFR_censor
>  analysis time _t:  FFR_month
>
> Iteration 0:   log likelihood =  -1133.369
> Iteration 1:   log likelihood = -1129.4686
> Iteration 2:   log likelihood = -1129.3196
> Iteration 3:   log likelihood = -1129.3191
> Refining estimates:
> Iteration 0:   log likelihood = -1129.3191
>
> Cox regression -- Efron method for ties
>
> No. of subjects =          399                     Number of obs   =
> 399
> No. of failures =          218
> Time at risk    =  9004.484606
>                                                  LR chi2(3)      =
>  8.10
> Log likelihood  =   -1129.3191                     Prob > chi2     =
>  0.0440
>
> ------------------------------------------------------------------------------
>         _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf.
> Interval]
> -------------+----------------------------------------------------------------
> _IHist_Gra~2 |   1.408166   .3166876     1.52   0.128     .9062001
>  2.188183
> _IHist_Gra~3 |    1.69506   .3886792     2.30   0.021     1.081443
>  2.656847
> _IHist_Gra~4 |   2.540278   .9997843     2.37   0.018      1.17455
> 5.49403
>
>
>
> R Output using
>> summary ( coxph( Surv(FFR_month,FFR_censor) ~ Hist_Grade_4grp,
> method=c("breslow")))
>> summary ( coxph( Surv(FFR_month,FFR_censor) ~ Hist_Grade_4grp,
> method=c("exact")))
>> summary ( coxph( Surv(FFR_month,FFR_censor) ~ Hist_Grade_4grp,
> method=c("efron")))
>
>
>
>  n=399 (21 observations deleted due to missingness)
>
>                    coef exp(coef) se(coef)     z Pr(>|z|)
> Hist_Grade_4grp.L 0.66685   1.94809  0.26644 2.503   0.0123 *
> Hist_Grade_4grp.Q 0.03113   1.03162  0.20842 0.149   0.8813
> Hist_Grade_4grp.C 0.08407   1.08771  0.13233 0.635   0.5252
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>                 exp(coef) exp(-coef) lower .95 upper .95
> Hist_Grade_4grp.L     1.948     0.5133    1.1556     3.284
> Hist_Grade_4grp.Q     1.032     0.9693    0.6857     1.552
> Hist_Grade_4grp.C     1.088     0.9194    0.8392     1.410
>
> Rsquare= 0.02   (max possible= 0.997 )
> Likelihood ratio test= 8.1  on 3 df,   p=0.044
> Wald test            = 8.02  on 3 df,   p=0.0455
> Score (logrank) test = 8.2  on 3 df,   p=0.04202
>
>        [[alternative HTML version deleted]]
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas


More information about the R-help mailing list