[R] model selection, stepAIC(), and coxph() (fwd)
Thomas Lumley
tlumley at u.washington.edu
Mon May 8 17:11:35 CEST 2006
On Sat, 6 May 2006, Chad Reyhan Bhatti wrote:
> Hello,
>
> My question concerns model selection, stepAIC(), add1(), and coxph().
>
> In Venables and Ripley (3rd Ed) pp389-390 there is an example of using
> stepAIC() for the automated selection of a coxph model for VA lung cancer
> data.
>
> A statistics question: Can partial likelihoods be interpreted in the same
> manner as likelihoods with respect to information-based criteria and
> likelihood ratio tests? It seems that they should be treated as
> quasilikelihoods which would make stepAIC() invalid and would require the
> use of add1() with an F-test for the reduction in deviance.
Since this is a question about the MASS book you would be better off
contacting the authors.
They do (as usual) know what they are doing. The Cox model is an
unusually (perhaps uniquely) well-behaved semiparametric model, and the
partial likelihood really does behave this way.
- For data without ties in the survival time the partial likelihood is
(proportional to) the marginal likelihood of the ranks, so it is a
perfectly good parametric likelihood. (Kalbfleisch & Prentice,
Biometrika, 1973)
- The chi^2 distribution (rather than F distribution) for the likelihood
ratio test is justified by the marginal likelihood, or by martingale
arguments (e.g. the book by Fleming and Harrington), or in more modern times
by empirical process arguments or as a semiparametric profile likelihood.
However, the only technically hard part is showing weak convergence -- the
original paper by Cox showed that the variance of the partial score equals
the expected negative Hessian of the partial log-likelihood, which is the
key fact for the chi^2 rather than F test to be valid (if one of them is).
- The same arguments suggest AIC will be appropriate for comparing
different subsets of variables in the same way that it is for generalized
linear models. I don't have a reference here.
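The points above can be sketched in R on the VA data from MASS. This is
only an illustration under my own assumptions: the particular nested
models and the object names (fit_small, fit_big, full, sel) are made up
for the example and are not the models from the book. It computes the
likelihood ratio test from the partial log-likelihoods, referred to a
chi^2 distribution, and the AIC as -2 * (partial log-likelihood) plus
twice the number of coefficients, then runs stepAIC() on a coxph fit.

```r
library(MASS)      # VA data and stepAIC()
library(survival)  # coxph() and Surv()

# Two nested Cox models (illustrative choice of covariates).
fit_small <- coxph(Surv(stime, status) ~ Karn, data = VA)
fit_big   <- coxph(Surv(stime, status) ~ Karn + cell, data = VA)

# Likelihood ratio test by hand: twice the gain in partial
# log-likelihood, referred to chi^2 with df = number of extra
# coefficients. loglik[2] is the value at the fitted coefficients.
lrt <- 2 * (fit_big$loglik[2] - fit_small$loglik[2])
df  <- length(coef(fit_big)) - length(coef(fit_small))
p   <- pchisq(lrt, df = df, lower.tail = FALSE)

# The same comparison via anova(), which reports a chi^2 test
# (not an F test) for coxph fits.
print(anova(fit_small, fit_big))

# AIC built directly from the partial log-likelihood.
aic_big <- -2 * fit_big$loglik[2] + 2 * length(coef(fit_big))

# Backward stepwise selection by AIC from a larger main-effects model.
full <- coxph(Surv(stime, status) ~ treat + age + Karn + diag.time +
                cell + prior, data = VA)
sel  <- stepAIC(full, trace = FALSE)
print(formula(sel))
```

The hand-computed aic_big should agree with extractAIC(fit_big)[2],
which is the quantity stepAIC() compares between candidate models.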
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle