[R] survival: ridge log-likelihood workaround

Wed Dec 15 00:54:31 CET 2010

You ask some good questions. 

> I would like to clarify statistics that ridge coxph returns. Here is
> my understanding and please correct me where I am wrong.

> 1) In his paper Gray [JASA 1992] suggests a Wald-type statistics with
> the formula for degree of freedom. The summary function for ridge
> coxph returns the degree of freedom and Wald test which are
> equivalent to what Gray wrote.

Not quite.  The printout contains "se" and "se2".  The first is the
suggestion
of Verwiej and Van Houwlingen and the second is the suggestion of Gray.
The first error estimate is always larger although usually very close to
the
second.  However, I have seen several data sets where the second
collapses and
produces a tiny variance estimate.  Thus the Wald statistic is based on 
the Verweij estimate.

> 2) Summary for ridge coxph prints a likelihood ratio test which is
>  not, if I may say, a proper likelihood ratio test. First it is
>  based on unpenalized log-likelihoods and the above defined degree
>  of freedom is used. I accept that there is nothing which suggests
>  that one should use penalized log-likelihoods. However, there is
>  also nothing published which suggests that unpenalized
>  log-likelihoods should be used with the above defined degree of
>  freedom. I have found that Therneau&Grambsch in their excellent
>  book discuss this in a paragraph and mention that the p-value thus
>  returned is somewhat conservative (p too large). Therefore, the
>  likelihood ratio test that ridge coxph returns is not a true one
>  and the statistics returned (i.e. 2*(loglik(beta)-loglik(0))) has
>  the distribution which is somewhat more compact than the
>  chi-square. I like conservative p-values and like to be on a safe
>  side. However, in my work Wald test p-values for ridge regression
>  are much higher than the "likelihood ratio test's" p-value and I
>  don't get impression that they are conservative.

There is literature, but it's been 10 years and I don't remember the
references.  One key issue, not documented I admit, is that for the
difference in LR between two models to be valid at all, they have to
have
the same penalty parameter.  I have almost never compared two penalized
LR to each other, likely because I rarely use ridge().  More often I am
comparing the LR of a penalized model to one without the penalty term, 
for which the issue does not arise.
  The same type of issue will arise, I believe, if one were to fit two
mixed effects models that differed only in the fixed effects, and then
used an LR test.  Comparisons are only valid if two have the same
variance
values, or else the models are not nested.

> 3) There is no efficient score test for ridge regression, as there is
no 
> penalized efficient score test. That is OK.

> 4) "Rsquare" and "max possible" are returned and I have so far failed
to 
> find exact references for them.
  I was sure that was somewhere, but I can't find it either.  The manual
page
summary.coxph.object fell off my list somewhere.
  The R^2 value is that of Nagelkirke.  At the time I wrote that code it
was
a simple estimate of R^2.  There have since been several other
proposals, and
a recent review by Schemper suggests that the Nagelkirke method may be
nearly
the worst of the bunch.  Replacing this is another item on my to-do
list. 

> I would like to add a note here that the coxph algorithm works really
> fast with high-dimensional covariates and all compliments to people
> who developed it.  There was evidently a lot of effort put to make it
> all work fast and correctly and, in my opinion, it is a shame that the
> last bit - summary for ridge coxph - is a bit, if I may say, shaky. In
> my opinion, only Wald test and degree of freedom as defined by Gray
> deserve to be part of summary for ridge coxph and I look forward to be
> corrected. On my behalf I am prepared in my spare time to write the
> code so that the summary for ridge coxph does not return NULL and that
> the statistics printed in summary for ridge coxph are based on
> published papers.

This is the hard issue.  The "penalized methods" addition to coxph was 
designed so that a new method could be added without any changes at all
to the coxph (or survreg) code.  The primary reason was to allow users
to
add arbitrary methods.  It turned out that no one ever has done so, at
least
none that I have ever heard of, so that particular design goal now looks
silly.
  A consequnce of the design is that the basic printouts for the model
consist of a constant piece + an addition bit supplied by the inserted
method.  There are others: the user has to know whether a particular
penalty plug in is appropriate for survreg, coxph, or both.  This is
true
for pspline and ridge, but most of the parameter combinations for
frailty
only work with coxph.  It is in the documentation for fraily, but has
nevertheless caused many a user to stumble.  
  I think that it should be rewritten, but this is a formidable task.
Note: the "people who developed" the survival package is --- me, and
there
are only so many hours in a day.  As I look ahead to retirement (about
10
years) it clearly is time to enlist more people in this project.

Terry Therneau