[R] What's the baseline model when using coxph with factor variables?

David Winsemius dwinsemius at comcast.net
Thu Dec 1 20:56:13 CET 2011


On Dec 1, 2011, at 1:00 PM, William Dunlap wrote:

> Terry will correct me if I'm wrong, but I don't think the
> answer to this question is specific to the coxph function.

It depends on how we interpret the questioner's intent. My answer
was predicated on the assumption that the phrase "baseline model"
meant the baseline survival function, S_0(t) in survival analysis
notation.
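
If what is wanted is the estimated survival curve for one specific covariate combination, one option is to ask survfit() for it explicitly through its newdata argument instead of relying on whatever reference values it uses by default. A minimal sketch, reusing the toy data from the example quoted below; the object names fit, ref and s_ref are mine and purely illustrative:

```r
library(survival)

# Toy data copied from the quoted example
df <- data.frame(time = sin(1:20) + 2,
                 cens = rep(c(0, 0, 1), len = 20),
                 var1 = factor(rep(0:1, each = 10)),
                 var2 = factor(rep(0:1, 10)))

fit <- coxph(Surv(time, cens) ~ var1 + var2, data = df)

# One-row data frame describing the covariate pattern of interest:
# both factors at their first level ("0")
ref <- data.frame(var1 = factor("0", levels = levels(df$var1)),
                  var2 = factor("0", levels = levels(df$var2)))

# Predicted survival curve for exactly that pattern
s_ref <- survfit(fit, newdata = ref)
summary(s_ref)
```

Changing the levels in ref gives the curve for any other combination, so the question of what the "baseline" is never has to be settled by guesswork.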


> For all the [well-written] formula-based modelling functions
> (essentially, those that call model.frame and model.matrix to
> interpret the formula) the option "contrasts" controls how factor
> variables are parameterized in the model matrix.  contr.treatment
> makes the baseline the first factor level, contr.SAS makes
> the baseline the last, contr.sum makes the baseline the mean,
> etc.  E.g.,
>
>> df <- data.frame(time=sin(1:20)+2,
>                   cens=rep(c(0,0,1), len=20),
>                   var1=factor(rep(0:1, each=10)),
>                   var2=factor(rep(0:1, 10)))
>> options(contrasts=c("contr.treatment", "contr.treatment"))
>> coxph(Surv(time, cens) ~ var1 + var2, data=df)
> Call:
> coxph(formula = Surv(time, cens) ~ var1 + var2, data = df)
>
>
>        coef exp(coef) se(coef)      z    p
> var11 0.1640      1.18    0.822 0.1995 0.84
> var21 0.0806      1.08    0.830 0.0971 0.92
>
> Likelihood ratio test=0.05  on 2 df, p=0.974  n= 20, number of events= 6
>> options(contrasts=c("contr.SAS", "contr.SAS"))
>> coxph(Surv(time, cens) ~ var1 + var2, data=df)
> Call:
> coxph(formula = Surv(time, cens) ~ var1 + var2, data = df)
>
>
>         coef exp(coef) se(coef)       z    p
> var10 -0.1640     0.849    0.822 -0.1995 0.84
> var20 -0.0806     0.923    0.830 -0.0971 0.92
>
> Likelihood ratio test=0.05  on 2 df, p=0.974  n= 20, number of events= 6
>> options(contrasts=c("contr.sum", "contr.sum"))
>> coxph(Surv(time, cens) ~ var1 + var2, data=df)
> Call:
> coxph(formula = Surv(time, cens) ~ var1 + var2, data = df)
>
>
>         coef exp(coef) se(coef)       z    p
> var11 -0.0820     0.921    0.411 -0.1995 0.84
> var21 -0.0403     0.960    0.415 -0.0971 0.92
>
> Likelihood ratio test=0.05  on 2 df, p=0.974  n= 20, number of events= 6
>
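
To see why the three sets of coefficients differ only in sign and scale, it may help to look at the dummy columns that model.matrix() builds under each coding. This is only a sketch using base R on the two factors from the example; the helper mm() and the object names are mine. Note that coxph drops the intercept column, which is absorbed into the baseline hazard.

```r
# The two factors from the example above
df <- data.frame(var1 = factor(rep(0:1, each = 10)),
                 var2 = factor(rep(0:1, 10)))

# Hypothetical helper: design matrix with the same coding for both factors
mm <- function(ctr) {
  model.matrix(~ var1 + var2, data = df,
               contrasts.arg = list(var1 = ctr, var2 = ctr))
}

mm_trt <- mm("contr.treatment")
mm_sas <- mm("contr.SAS")
mm_sum <- mm("contr.sum")

# Rows 1, 2, 11 and 12 cover the four (var1, var2) combinations:
# (0,0), (0,1), (1,0), (1,1)
rows <- c(1, 2, 11, 12)
mm_trt[rows, ]   # first level coded 0, second level coded 1
mm_sas[rows, ]   # last level coded 0, first level coded 1
mm_sum[rows, ]   # first level coded +1, last level coded -1
```

Under contr.sum the two levels sit one coefficient above and one below zero, so the difference between them is twice the coefficient with the sign flipped, which matches 0.1640 versus -0.0820 in the output above.
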
> (lm() has a contrasts argument that can override
> getOption("contrasts") and set different contrasts for each
> variable but coxph() does not have that argument.)
>
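
One possible way to get per-variable codings without changing the global option is to attach the contrasts to the factors themselves with the `contrasts<-` replacement function. This sketch assumes that coxph builds its design matrix through model.frame and model.matrix as described above, so that the attribute on each factor is honored; the data frame and fit names are mine.

```r
library(survival)

# Toy data copied from the example above
df <- data.frame(time = sin(1:20) + 2,
                 cens = rep(c(0, 0, 1), len = 20),
                 var1 = factor(rep(0:1, each = 10)),
                 var2 = factor(rep(0:1, 10)))

# Treatment coding stated explicitly for both factors
df_trt <- df
contrasts(df_trt$var1) <- contr.treatment(2)
contrasts(df_trt$var2) <- contr.treatment(2)

# Sum-to-zero coding for var1 only; var2 keeps treatment coding
df_mix <- df_trt
contrasts(df_mix$var1) <- contr.sum(2)

fit_trt <- coxph(Surv(time, cens) ~ var1 + var2, data = df_trt)
fit_mix <- coxph(Surv(time, cens) ~ var1 + var2, data = df_mix)

# Same model, different parameterization: the var1 coefficient is
# rescaled, the var2 coefficient and the log-likelihood are unchanged
rbind(treatment = unname(coef(fit_trt)),
      mixed     = unname(coef(fit_mix)))
c(fit_trt$loglik[2], fit_mix$loglik[2])
```
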
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of David Winsemius
>> Sent: Thursday, December 01, 2011 9:36 AM
>> To: a.schlicker at nki.nl
>> Cc: r-help at r-project.org
>> Subject: Re: [R] What's the baseline model when using coxph with factor variables?
>>
>>
>> On Dec 1, 2011, at 12:00 PM, Andreas Schlicker wrote:
>>
>>> Hi all,
>>>
>>> I'm trying to fit a Cox regression model with two factor variables
>>> but have some problems with the interpretation of the results.
>>> Considering the following model, where var1 and var2 can assume
>>> value 0 and 1:
>>>
>>> coxph(Surv(time, cens) ~ factor(var1) * factor(var2),  data=temp)
>>>
>>> What is the baseline model? Is that considering the whole population
>>> or the case when both var1 and var2 = 0?
>>
>> This has been discussed several times in the past on rhelp. My
>> suggestion would be to search your favorite rhelp archive using
>> "baseline hazard Therneau", since Terry Therneau is the author of
>> survival. (The answer is closer to the first than to the second.)
>>
>>>
>>> Kind regards,
>>> andi
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius, MD
>> West Hartford, CT
>>

David Winsemius, MD
West Hartford, CT


