[R] coxph and ordinal variables?
David Winsemius
dwinsemius at comcast.net
Thu Sep 9 01:02:46 CEST 2010
On Sep 8, 2010, at 6:43 PM, Min-Han Tan wrote:
> Dear R-help members,
>
> Apologies - I am posting on behalf of a colleague, who is a little
> puzzled
> as STATA and R seem to be yielding different survival estimates for
> the same
> dataset when treating a variable as ordinal. Ordered() is used to
> represent
> an ordinal variable) I understand that R's coxph (by default) uses
> the Efron
> approximation, whereas STATA uses (by default) the Breslow. but we did
> compare using the same approximations. I am wondering if this is a
> result of
> how coxph manages an ordered factor?
>
> Essentially, this is a survival dataset using tumor grade (1, 2, 3
> and 4) as
> the risk factor. This is more of an 'ordinal' variable, rather than a
> continuous variable. For the same data set of 399 patients, when
> treating
> the vector of tumor grade as a continuous variable (range of 1 to 4),
> testing the Efron and the Breslow approximations yield the same
> result in
> both R and STATA.
>
> However, when Hist_Grade_4 grp is converted into an ordered factor
> using
> ordered(), and the same scripts are applied, rather different
> results are
> obtained, relative to the STATA output. This is tested across the
> different
> approximations, with consistent results. The comparison using Efron
> approximation and ordinal data is is below.
Are you sure you want an ordered factor? In R this means you will be
creating linear, quadratic and cubic contrasts. Notice the L, Q and C
designations on the coefficients. That certainly does not look to be
comparable to what you are getting from Stata. My suggestion would be
to create an un-ordered factor in R and see whether you get results
more in line with Stata's output when applied to your data.
--
David.
>
> Your advice is very much appreciated!
>
> Min-Han
>
> Apologies below for the slightly malaligned output.
>
> STATA output
>
> . xi:stcox i.Hist_Grade_4grp, efr
> i.Hist_Grade_~p _IHist_Grad_1-4 (naturally coded; _IHist_Grad_1
> omitted)
>
> failure _d: FFR_censor
> analysis time _t: FFR_month
>
> Iteration 0: log likelihood = -1133.369
> Iteration 1: log likelihood = -1129.4686
> Iteration 2: log likelihood = -1129.3196
> Iteration 3: log likelihood = -1129.3191
> Refining estimates:
> Iteration 0: log likelihood = -1129.3191
>
> Cox regression -- Efron method for ties
>
> No. of subjects = 399 Number of obs =
> 399
> No. of failures = 218
> Time at risk = 9004.484606
> LR chi2(3) =
> 8.10
> Log likelihood = -1129.3191 Prob > chi2 =
> 0.0440
>
> ------------------------------------------------------------------------------
> _t | Haz. Ratio Std. Err. z P>|z| [95% Conf.
> Interval]
> -------------
> +----------------------------------------------------------------
> _IHist_Gra~2 | 1.408166 .3166876 1.52 0.128 .9062001
> 2.188183
> _IHist_Gra~3 | 1.69506 .3886792 2.30 0.021 1.081443
> 2.656847
> _IHist_Gra~4 | 2.540278 .9997843 2.37 0.018 1.17455
> 5.49403
>
>
>
> R Output using
>> summary ( coxph( Surv(FFR_month,FFR_censor) ~ Hist_Grade_4grp,
> method=c("breslow")))
>> summary ( coxph( Surv(FFR_month,FFR_censor) ~ Hist_Grade_4grp,
> method=c("exact")))
>> summary ( coxph( Surv(FFR_month,FFR_censor) ~ Hist_Grade_4grp,
> method=c("efron")))
>
>
>
> n=399 (21 observations deleted due to missingness)
>
> coef exp(coef) se(coef) z Pr(>|z|)
> Hist_Grade_4grp.L 0.66685 1.94809 0.26644 2.503 0.0123 *
> Hist_Grade_4grp.Q 0.03113 1.03162 0.20842 0.149 0.8813
> Hist_Grade_4grp.C 0.08407 1.08771 0.13233 0.635 0.5252
> ---
> Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
>
> exp(coef) exp(-coef) lower .95 upper .95
> Hist_Grade_4grp.L 1.948 0.5133 1.1556 3.284
> Hist_Grade_4grp.Q 1.032 0.9693 0.6857 1.552
> Hist_Grade_4grp.C 1.088 0.9194 0.8392 1.410
>
> Rsquare= 0.02 (max possible= 0.997 )
> Likelihood ratio test= 8.1 on 3 df, p=0.044
> Wald test = 8.02 on 3 df, p=0.0455
> Score (logrank) test = 8.2 on 3 df, p=0.04202
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT
More information about the R-help
mailing list