[R] Cox model approximations (was "comparing SAS and R survival....)

Sun Jul 24 13:49:01 CEST 2011

On Fri, Jul 22, 2011 at 2:04 PM, Terry Therneau <therneau at mayo.edu> wrote:
>  For time scales that are truly discrete Cox proposed the "exact partial
> likelihood".

Or "the method of partial likelihood" applied to the discrete logistic model.

> I call that the "exact" method and SAS calls it the
> "discrete" method.  What we compute is precisely the same, however they
> use a clever algorithm which is faster.

Note that the model to estimate here is discrete. The "base-line"
conditional probabilities at each failure time are eliminated through
the partial likelihood argument. This can also be described as a
conditional logistic regression, where we condition on the total
number of failures in each risk set (thus eliminating the
risk-set-specific parameters). Suppose that in a risk set of size  n
there are  d  failures. This method must then consider all possible
ways of choosing  d  failures out of  n  at risk, or choose(n, d)
cases. With many ties, this makes the computational burden huge.
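
To make the choose(n, d) count concrete, here is a minimal sketch in plain
Python (standard library only, chosen so that no packages are needed). It
evaluates the conditional logistic contribution of a single risk set by
brute force. The function name, the covariate values, and beta = 0.5 are
made up for illustration; this is not the algorithm used by coxph or SAS.

```python
from itertools import combinations
from math import comb, exp, prod

def exact_discrete_term(scores, failed):
    """Conditional-logistic contribution of one risk set (illustrative).

    scores: risk scores exp(x_i * beta) for everyone at risk
    failed: indices of the d subjects who failed at this time
    """
    d = len(failed)
    numerator = prod(scores[i] for i in failed)
    # Denominator: one term for every way of choosing d failures out of n.
    denominator = sum(prod(subset) for subset in combinations(scores, d))
    return numerator / denominator

# Hypothetical risk set: n = 6 at risk, d = 3 tied failures.
beta = 0.5
x = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
scores = [exp(beta * xi) for xi in x]
failed = [0, 1, 2]

term = exact_discrete_term(scores, failed)
n_subsets = comb(len(scores), len(failed))  # number of terms in the denominator
```

The denominator has choose(6, 3) terms here, and the count grows very
quickly with n and d, which is why the brute-force sum is only usable for
small risk sets.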

The method "ml" in "coxreg" (package 'eha') uses a different approach.
Instead of conditional logistic regression it performs unconditional
logistic regression by adding one parameter per risk set. In principle
this is possible to do with 'glm' after expanding the data set with
"toBinary" in 'eha', but with large data sets and lots of risk sets,
glm chokes. Instead, with the "ml" approach in "coxreg", the extra
parameters just introduced are eliminated by profiling them out! This
leads to a fast estimation procedure, compared to the above-mentioned
'exact' methods. A final note: with "ml", the logistic regression
uses the cloglog link, to be compatible with the situation when data
really are continuous but grouped, and a proportional hazards model
holds.
(Interestingly, conditional inference is usually used to simplify
things; here it creates computational problems not present without
conditioning.)
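
What "profiling out" means can be sketched for a single risk set. With the
cloglog link, subject i fails with probability 1 - exp(-exp(alpha + eta_i)),
where alpha is the risk-set parameter and eta_i = x_i * beta. For fixed
beta, alpha is replaced by the value that maximizes the likelihood of that
risk set. The Python below is only an illustration under these assumptions:
the function names are made up, bisection is a choice made for this sketch,
and the actual code in 'eha' may well work differently.

```python
from math import exp, expm1, log

def cloglog_loglik(alpha, eta, failed):
    """Log-likelihood of one risk set, P(fail_i) = 1 - exp(-exp(alpha + eta_i))."""
    total = 0.0
    for i, e in enumerate(eta):
        h = exp(alpha + e)
        if i in failed:
            total += log(-expm1(-h))   # log(1 - exp(-h))
        else:
            total += -h                # log(exp(-h))
    return total

def score_alpha(alpha, eta, failed):
    """Derivative of cloglog_loglik with respect to alpha."""
    s = 0.0
    for i, e in enumerate(eta):
        h = exp(alpha + e)
        if i in failed:
            s += h * exp(-h) / (-expm1(-h))
        else:
            s -= h
    return s

def profile_alpha(eta, failed, lo=-30.0, hi=30.0, tol=1e-12):
    """Bisection for the maximizing alpha at fixed eta.

    Assumes 0 < d < n and moderate eta, so the score changes sign on [lo, hi].
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score_alpha(mid, eta, failed) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def profile_loglik(eta, failed):
    """Log-likelihood of the risk set with alpha profiled out."""
    return cloglog_loglik(profile_alpha(eta, failed), eta, failed)

# Hypothetical risk set: n = 10 at risk, d = 3 failures, beta = 0.
eta = [0.0] * 10
failed = {0, 1, 2}
alpha_hat = profile_alpha(eta, failed)
```

With beta = 0 the profiled alpha should reproduce the raw failure
proportion, that is 1 - exp(-exp(alpha_hat)) = d/n, which gives a simple
check of the sketch. Only a one-dimensional search per risk set is needed,
with no enumeration of subsets.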

>  To make things even more
> confusing, Prentice introduced an "exact marginal likelihood" which is
> not implemented in R, but which SAS calls the "exact" method.

This is not so confusing if we realize that we are now in the
continuous-time model. Then, with a risk set of size  n  with  d
failures, we must consider all possible permutations of the  d
failures, or  d!  cases. That is, here we assume that ties occur
because of imprecise measurement and that there is one true ordering.
This method calculates an average contribution to the partial
likelihood. (Btw, you refer to "Prentice", but isn't this from the
Biometrika paper by Kalbfleisch & Prentice (1973)? And of course their
classical book?)
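
The averaging over orderings can be written out directly for one risk set.
The following Python sketch (function names and numbers are made up for
illustration) multiplies the usual partial likelihood factors for one
assumed ordering of the tied failures, and then averages over all d!
orderings. Summing instead of averaging would differ only by the constant
factor d!, which does not affect maximization.

```python
from itertools import permutations
from math import exp, factorial

def ordered_term(scores, order):
    """Partial-likelihood product if the tied failures occurred in `order`."""
    remaining = sum(scores)
    value = 1.0
    for i in order:
        value *= scores[i] / remaining
        remaining -= scores[i]  # subject i leaves the risk set
    return value

def averaged_term(scores, failed):
    """Average of ordered_term over all d! orderings of the tied failures."""
    orders = list(permutations(failed))
    return sum(ordered_term(scores, o) for o in orders) / len(orders)

# Hypothetical risk set: n = 5 at risk, d = 3 tied failures.
beta = 0.5
x = [1.0, 0.0, 2.0, 0.0, 1.0]
scores = [exp(beta * xi) for xi in x]
failed = [0, 1, 2]

term = averaged_term(scores, failed)
n_orderings = factorial(len(failed))
```

Brute-force enumeration like this is only feasible for small d, since d!
grows very fast.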

>  Data is usually not truly discrete, however.  More often ties are the
> result of imprecise measurement or grouping.  The Efron approximation
> assumes that the data are actually continuous but we see ties because of
> this; it also introduces an approximation at one point in the
> calculation which greatly speeds up the computation; numerically the
> approximation is very good.

Note that both Breslow's and Efron's approximations are approximations
of the "exact marginal likelihood".
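
For reference, the two approximations for a single risk set with d tied
failures can be sketched as follows. Breslow uses the full risk-set sum in
every one of the d denominators, while Efron removes a fraction j/d of the
tied subjects' total score from the j-th denominator. This is illustrative
Python with made-up function names and data, not the code of any package.

```python
from math import exp, prod

def breslow_term(scores, failed):
    """Breslow: every tied failure is divided by the full risk-set sum."""
    d = len(failed)
    return prod(scores[i] for i in failed) / sum(scores) ** d

def efron_term(scores, failed):
    """Efron: the j-th denominator drops j/d of the tied subjects' total."""
    d = len(failed)
    total = sum(scores)
    tied = sum(scores[i] for i in failed)
    denominator = prod(total - (j / d) * tied for j in range(d))
    return prod(scores[i] for i in failed) / denominator

# Hypothetical risk set: n = 5 at risk, d = 3 tied failures.
beta = 0.5
x = [1.0, 0.0, 2.0, 0.0, 1.0]
scores = [exp(beta * xi) for xi in x]
failed = [0, 1, 2]

b = breslow_term(scores, failed)
e = efron_term(scores, failed)
```

When d = 1 both reduce to the ordinary partial likelihood term. When all
tied subjects have the same risk score, the Efron denominators are exactly
those of any single ordering of the tied failures, so in that special case
nothing is lost by the approximation.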

>  In spite of the irrational love that our profession has for anything
> branded with the word "exact", I currently see no reason to ever use
> that particular computation in a Cox model.

Agreed; but only because it is so time consuming. The unconditional
logistic regression with profiling is a good alternative.

> I'm not quite ready to
> remove the option from coxph, but certainly am not going to devote any
> effort toward improving that part of the code.
>
>  The Breslow approximation is less accurate, but is the easiest to
> program and therefore was the only method in early Cox model programs;
> it persists as the default in many software packages because of history.
> Truth be told, unless the number of tied deaths is quite large the
> difference in results between it and the Efron approx will be trivial.
>
>  The worst approximation, and the one that can sometimes give seriously
> strange results, is to artificially remove ties from the data set by
> adding a random value to each subject's time.

Maybe, but randomly breaking ties may not be a bad idea; you could
regard that as getting an (unbiased?) estimator of the
exact (continuous-time) partial likelihood. Expanding: Instead of
going through all possible permutations, why not take a random sample
of size greater than one?
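
A minimal sketch of that idea for one risk set, in Python with made-up
names and data: draw a number of random orderings of the tied failures and
average the resulting partial likelihood products. Since each ordering is
drawn uniformly, the average is an unbiased estimate of the mean over all
d! orderings; a sample of size one corresponds to breaking the ties at
random once.

```python
import random
from math import exp

def ordered_term(scores, order):
    """Partial-likelihood product if the tied failures occurred in `order`."""
    remaining = sum(scores)
    value = 1.0
    for i in order:
        value *= scores[i] / remaining
        remaining -= scores[i]  # subject i leaves the risk set
    return value

def sampled_term(scores, failed, n_samples, rng):
    """Monte Carlo average of ordered_term over random orderings of the ties."""
    total = 0.0
    for _ in range(n_samples):
        order = list(failed)
        rng.shuffle(order)  # uniform random ordering of the tied failures
        total += ordered_term(scores, order)
    return total / n_samples

# Hypothetical risk set: n = 5 at risk, d = 3 tied failures.
beta = 0.5
x = [1.0, 0.0, 2.0, 0.0, 1.0]
scores = [exp(beta * xi) for xi in x]
failed = [0, 1, 2]

rng = random.Random(2011)  # fixed seed so the sketch is reproducible
estimate = sampled_term(scores, failed, n_samples=200, rng=rng)
```

How large the sample needs to be in practice, and how the sampling noise
interacts with maximization over beta, is left open here.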

Göran

> Terry T
>
>
> --- begin quote --
> I didn't know precisely the specifics of each approximation method.
> I thus came back to section 3.3 of Therneau and Grambsch, Extending the
> Cox Model. I think I now see things more clearly. If I have understood
> correctly, both the "discrete" option and the "exact" functions assume
> "true" discrete event times in a model approximating the Cox model. The
> Cox partial likelihood cannot be exactly maximized, or even written,
> when there are some ties, am I right?
>
> In my sample, many of the ties (those within a single observation of
> the process) are due to the fact that continuous event times are
> grouped into intervals.
>
> So I think the logistic approximation may not be the best for my
> problem, even though the estimates on my real data set (shown in my
> previous post) do give interesting results in the context of my data
> set!
> I was thinking about distributing the events uniformly in each
> interval. What do you think about this option? Can I expect a better
> approximation than by applying the Breslow or Efron method directly to
> the grouped event data? Finally, it becomes a modelling problem more
> than a computational or algorithmic one, I guess.

-- 
Göran Broström