[R] Presenting Hazard ratios for interacting variables in a Cox model
David Winsemius
dwinsemius at comcast.net
Mon Nov 21 20:59:45 CET 2016
> On Nov 21, 2016, at 4:21 AM, Stuart Patterson <stuartjpatterson at googlemail.com> wrote:
>
> Dear David,
> Thank you for your reply. Your suggestions on how better to write the command are very useful, and I can see how the simplification would help. I hadn't realised that the lower order variables would be included if I simply wrote in teh interaction terms. Thank you
>
> The table below is how I feel that I ought to present my results. If anyone has any suggestions on how to obtain these values I would be very grateful!
You should be able to fill in the first empty column using the results of coef(model) and
expand.grid( Year=levels(Year), Prev_TB=levels(Prev_TB), Age=levels(Age) )
Calculating the CI's by hand would require knowing the covariance matrix and you should be able to get the needed components from vcov(model). I wondered if using the confint function would be a more economical approach, however, when I look at the code of confint.default, it does not appear that it would handle contrasts involving interactions terms in a manner that I would think was totally correct. It appears to ignore the off-diagonal elements of the vcov matrix. Perhaps using the confint function in the multcomp package would yield more correct estimates. It's documented as "averaging over the interactions", however the code appears to use only the diagonal elements as well.
I think the "right way" is outlined by Dalgaard and Laviolette in a recent Rhelp thread:
https://stat.ethz.ch/pipermail/r-help/2016-August/440879.html
You would set up the D matrix to represent levels in the grid of values to match the contrasts being considered. I did find a phia package that might automate this process, but I don't have experience with it.
Your table is probably only readable by me. The rest of the Rhelp world got a single column because you posting in HTML. Do learn the post to Rhelp in plain text.
I dropped it into OpenOffice.org, saved as csv/txt, and read it into R:
> read.csv("~/Documents/Untitled 1.csv")
Variable X X.1 HRs ci.95. p.Value
1 Year Prev. TB Age
2 2002-2005 No < 24 months
3 24-48 months
4 >48 months
5 Yes < 24 months
6 24-48 months
7 >48 months
8 2006-2010 No < 24 months
9 24-48 months
10 >48 months
11 Yes < 24 months
12 24-48 months a b c
13 >48 months
14 2011-2015 No < 24 months
15 24-48 months
16 >48 months
17 Yes < 24 months
18 24-48 months
19 >48 months
--
David
>
> On 18 November 2016 at 19:21, David Winsemius <dwinsemius at comcast.net> wrote:
>
> > On Nov 18, 2016, at 6:56 AM, Stuart Patterson via R-help <r-help at r-project.org> wrote:
> >
> > I have a time-dependent cox model with three variables, each of which
> > interacts with the other two. So my final model is:
> >
> > fit12<-coxph(formula = Surv(data$TimeIn, data$Timeout, data$Status) ~ data$
> > Year+data$Life_Stg+data$prev.tb +data$prev.tb*data$Life_Stg + data$Year*data
> > $Life_Stg + data$Year*data$prev.tb + frailty(data$Natal_Group), data = data)
>
> It seems fairly likely that you are shooting yourself in the foot by using the `data$variate` inside the formula. It will prevent the regression result from having correctly assembled references to variables. And that will become evident when you try to do any predictions. Try instead:
>
> fit12<-coxph(formula = Surv( TimeIn, Timeout, Status) ~
> prev.tb * Life_Stg + Year *Life_Stg + Year * prev.tb + frailty( Natal_Group),
> data = data)
>
> The `*` in a formula automatically includes the lower order individual variates in the estimates. Your model RHS could have also been written (more clearly in my opinion):
>
> ~ (prev.tb + Life_Stg + Year)^2
>
> ... since R formulas interpret the `(.)^N` operation as "all base effects and interactions up to order N".
>
> >
> > For my variables, there are 3 categories of year, three of year, and
> > prev.tb is a binary variable. Because of the interactions, when I present
> > the results, I want to present the Hazard ratio, 95% CI, and p value for
> > each combination of the three variables. How do I get R to give me these
> > values please?
> >
> > I think that the contrast function does this for other models but does not
> > work for coxph?
>
> The usual method would be to use `predict` on a 'newdata' dataframe with all the combinations generated from `expand.grid`. The combination of the reference values of all three variables should yield a 1.0 hazard ratio. But time-dependent model predictions need a complete specification of a sequence of values over the time course of the study (as well as specification of the frailty term. So I'm not in a position to comment on feasibility for this situation.
>
> >
> > Grateful for any suggestions
> >
> > Best wishes
> >
> > Stuart Patterson, Royal Veterinary College, University of London
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
>
>
David Winsemius
Alameda, CA, USA
More information about the R-help
mailing list