[R] changes in coxph in "survival" from older version?

Shi, Tao shidaxia at yahoo.com
Tue May 17 01:56:38 CEST 2011


Hi Frank,

I know it's kind of beyond the scope of R help, but I would appreciate it if you 
can elaborate further on this.  Are you worrying about this variable selection 
approach or you think that there is something wrong with the data (it's actually 
a real dataset)?  If it's the first one, I believe I can always narrow down the 
variables based on univarate analysis and build a multivariate model from that.

Many thanks in advance.

...Tao




----- Original Message ----
> From: Frank Harrell <f.harrell at vanderbilt.edu>
> To: r-help at r-project.org
> Sent: Mon, May 16, 2011 11:25:20 AM
> Subject: Re: [R] changes in coxph in "survival" from older version?
> 
> Please don't be serious about doing variable selection with this  dataset.
> Frank
> 
> Shi, Tao wrote:
> > 
> > Hi Terry,
> > 
> > Really appreciate your help!  Sorry for my late reply.
> > 
> > I did realize that there are way more predictors in the model.  My  initial 
> > thinking was use that as an initial model for stepwise model  selection. 
> > Now I 
> > wonder if the model selection result is still  valid if the initial model
> > didn't 
> > even converge?
> > 
> > Thanks!
> > 
> > ...Tao
> > 
> > 
> > 
> > 
> > ----- Original Message ----
> >> From: Terry Therneau <therneau at mayo.edu>
> >> To:  "Shi, Tao" <shidaxia at yahoo.com>
> >> Cc: r-help at r-project.org
> >> Sent:  Thu, May 12, 2011 6:42:09 AM
> >> Subject: Re: changes in coxph in  "survival" from older version?
> >> 
> >> 
> >> On Wed,  2011-05-11 at 16:11 -0700, Shi, Tao wrote:
> >> > Hi all,
> >>  > 
> >> > I found that the two different versions of "survival"  packages, namely  
> >>2.36-5 
> >>
> >> > vs.  2.36-8 or later, give different results for coxph  function. 
> >>  Please see 
> > 
> >> > below and the data is attached.   The  second one was done on Linux, but 
> >>Windows 
> >>
> >> > gave the same results.   Could you please let  me know which one I
> >> should 
> >>trust?
> >> > 
> >> >  Thanks,
> >> 
> >>  In your case,  neither.  Your data set has 22 events and 17  predictors;
> >>  the rule of thumb for a reliable Cox model is 10-20 events per   predictor
> >> which implies no more than 2 for your data set.  As a  result,  the
> >> coefficients of your model have very wide  confidence intervals, the  coef
> >> for Male for instance has se of  3.26, meaning the CI goes from 1/26  to
> >> 26 times the estimate;  i.e., there is no biological meaning to  the
> >>  estimate.
> >> 
> >>   Nevertheless, why did coxph give a  different  answer?  The later
> >> version 2.36-9 failed to  converge (20 iterations)  with a final
> >> log-likelihood of  -19.94, the earlier code converges in 10  iterations to
> >>  -19.91.  In version 2.36-6 an extra check was put into the  maximizer  for
> >> coxph in response to an exceptional data set which caused  the  routine to
> >> fail due to overflow of the exp function; the  Newton-Raphson  iteration
> >> algorithm had made a terrible guess  in it's iteration path, which  can
> >> happen with all NR based  search methods.  
> >>    I put a limit  on the size  the linear predictor in the Cox model of
> >> 21.  The basic   argument is that exp(linear-predictor) = relative risk
> >> for a  subject, and  that there is not much biological meaning for  risks
> >> to be less than exp(-21)  ~ 1/(population of the  earh).  There is more to
> >> the reasoning,  interested  parties should look at the comments in
> >> src/coxsafe.c, a 5 line   routine with 25 lines of discussion.  I will
> >> happily accept  input the  "best" value for the constant.
> >> 
> >>     I never expected to see a data set  with both convergence of the  LL
> >> and linear predictors larger than +-15.   Looking at the fit  (older code)
> >> > round(fit2$linear.predictor, 2)
> >>    [1]   2.26   0.89   4.96 -19.09 -12.10   1.39     2.82   3.10
> >>  [9]  18.57 -25.25  22.94    8.75   5.52  -27.64  14.88 -23.41
> >> [17]  13.70  -28.45  -1.84   10.04  12.62   2.54   6.33   -8.76
> >> [25]   9.68    4.39   2.92    3.51   6.02 -17.24   5.97
> >> 
> >> This says   that, if the model is to be believed, you have several near
> >>  immortals in the  data set. (Everyone else on earth will perish  first).
> >> 
> >> Terry  Therneau
> >> 
> >> 
> >> 
> >>
> > 
> >  ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the  posting guide
> > http://www.R-project.org/posting-guide.html
> > and  provide commented, minimal, self-contained, reproducible code.
> > 
> 
> 
> -----
> Frank Harrell
> Department of Biostatistics, Vanderbilt  University
> --
> View this message in context:  
>http://r.789695.n4.nabble.com/changes-in-coxph-in-survival-from-older-version-tp3516101p3527017.html
>
> Sent  from the R help mailing list archive at Nabble.com.
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting  guide http://www.R-project.org/posting-guide.html
> and provide commented,  minimal, self-contained, reproducible code.
>



More information about the R-help mailing list