[R] Trend test for hazard ratios

Tue Apr 15 22:51:15 CEST 2014

On Apr 15, 2014, at 6:32 AM, Therneau, Terry M., Ph.D. wrote:

> You can do statistical tests within a single model, for whether portions of it fit or do not fit.  But one cannot take three separate fits and compare them.  The program needs context to know how the three relate to one another.  Say that "group" is your strata variable, trt the variable of interest, and x1, x2 are adjusters.
> 
>   fit <- coxph(Surv(time,status) ~ trt * strata(group) + x1 + x2, data=mydata)
> 
> Will fit a model with a separate treatment coefficient for each of the groups, and a separate baseline hazard for each.  One can now create a contrast that corresponds to your trend test, using vcov(fit) for the variance matrix and coef(fit) to retrieve the coefficients.
> 

I have at the moment on my workspace a dataset of breast cancer cases extracted from SEER that has a factor representing three grades of histology: 
$Grade.abb. $ Grade.abb     : Factor w/ 3 levels "Well","Moderate", "Poorly"

It would be reasonable to test whether the grading has a "trend" in its effect when controlling for other factors (and it would be surprising to a medical audience if there were no effect.). So I compare across models using  AgeStgSiz.NG.Rad as my "linear" trend" model (with one df for an `as.numeric` version, AgeStgSizRad as my no-grade-included baseline, and AgeStgSiz.FG.Rad as my full factor version:

> AgeStgSiz.NG.Rad <- coxph( Surv(Survmon, Vital.status.recode..study.cutoff.used.=="Dead") ~ AgeDx+ SEER.historic.stage.A+ as.numeric(Size.cat) + as.numeric(Grade.abb) + Rgrp , data=BrILR)
> AgeStgSizRad <- coxph( Surv(Survmon, Vital.status.recode..study.cutoff.used.=="Dead") ~ AgeDx+ SEER.historic.stage.A+ as.numeric(Size.cat) + Rgrp , data=BrILR)
> AgeStgSiz.FG.Rad <- coxph( Surv(Survmon, Vital.status.recode..study.cutoff.used.=="Dead") ~ AgeDx+ SEER.historic.stage.A+ as.numeric(Size.cat) + Grade.abb + Rgrp , data=BrILR)
> 2*diff(summary(AgeStgSizRad)[['loglik']])
[1] 5166.291
> 2*diff(summary(AgeStgSiz.NG.Rad)[['loglik']])
[1] 5888.492
> 2*diff(summary(AgeStgSiz.FG.Rad)[['loglik']])
[1] 5889.379

So there is strong evidence that adding grade to the existing covariates improves the fit but that representing as separate factor values with one extra degree of freedom may not be needed. When I add grade as a stratum I get a very different 2*loglikelihood:

> AgeStgSiz.SG.Rad <- coxph( Surv(Survmon, Vital.status.recode..study.cutoff.used.=="Dead") ~ AgeDx+ SEER.historic.stage.A+ as.numeric(Size.cat) + strata(Grade.abb) + Rgrp , data=BrILR)
> 2*diff(summary(AgeStgSiz.SG.Rad)[['loglik']])
[1] 3980.908

> dput( vcov(AgeStgSiz.SG.Rad) )
structure(c(9.00241385282728e-06, -4.45446264398645e-07, 5.18927440846587e-07, 
2.62020260612094e-07, 7.47434378232446e-07, -4.45446264398645e-07, 
0.0046168537719431, 0.00445692601518848, -8.67833275051278e-07, 
-3.68093395861629e-05, 5.18927440846587e-07, 0.00445692601518848, 
0.00464685164887969, -1.61616621634903e-05, -3.77256742079467e-05, 
2.62020260612094e-07, -8.67833275051278e-07, -1.61616621634903e-05, 
7.84049821976807e-06, 1.8221575745622e-06, 7.47434378232446e-07, 
-3.68093395861629e-05, -3.77256742079467e-05, 1.8221575745622e-06, 
0.000313989310316303), .Dim = c(5L, 5L), .Dimnames = list(c("AgeDx", 
"SEER.historic.stage.ALocalized", "SEER.historic.stage.ARegional", 
"as.numeric(Size.cat)", "RgrpTRUE"), c("AgeDx", "SEER.historic.stage.ALocalized", 
"SEER.historic.stage.ARegional", "as.numeric(Size.cat)", "RgrpTRUE"
)))
> dput(coef(AgeStgSiz.SG.Rad))
structure(c(0.0393472050734995, 0.901971276489615, 1.46695753267995, 
0.108860100649677, -0.273688779502084), .Names = c("AgeDx", "SEER.historic.stage.ALocalized", 
"SEER.historic.stage.ARegional", "as.numeric(Size.cat)", "RgrpTRUE"
))

I'm not particularly facile with contrast construction with var-covar matrices, so hoping for a worked example. Also wondering of the cross-model comparisons are invalid or less powerful?

> Terry T.
> 
> 
> 
> On 04/15/2014 05:00 AM, r-help-request at r-project.org wrote:
>> Hello,
>> 
>> I have the following problem. I stratified my patient cohort into three
>> ordered groups and performed multivariate adjusted Cox regression analysis
>> on each group separately. Now I would like to calculate a p for trend across
>> the hazard ratios that I got for the three groups. How can I do that if I
>> only have the HR and the confidence interval? For example I got the
>> following HRs for one endpoint:
>> 
>> 1.09(0.68-1.74),	1.29(0.94-1.76) and 1.64(1.01-2.68).
>> 
>> There is a trend but how do I calculate if it is significant?
>> 
>> Best regards
>> 
>> Marcus Kleber
>> 
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA