[R] Survival Regression with multiple events per subject
Terry Therneau
therneau at mayo.edu
Mon Apr 28 15:22:21 CEST 2008
> I want to process a maximum likelihood estimation for a parametric
> regression survival time model with multiple events per subject.
Data sets with multiple records per subject are used for several things; you
need to tell me what it is that you want to accomplish. Multiple records are a
method, not a goal.
1. Robust variance: If each observation is a separate measurement on the
subject, with its own covariates, time 0, and endpoint, and you want a "GEE"
type variance that accounts for the fact that multiple observations are for the
same subject:
survreg(Surv(time, exercise) ~ itm + posret + negret + cluster(id), ...
where id is a variable that is unique for unique subjects.
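A minimal sketch of case 1 on simulated data, for readers who want something
they can run. The data frame d, the covariate x, and the simulation settings
are all hypothetical and are not taken from the original question; only
survreg, Surv and cluster come from the survival package.

```r
library(survival)

# Hypothetical data: 50 subjects with 3 separate observations each.
set.seed(1)
n_subj <- 50
d <- data.frame(id = rep(seq_len(n_subj), each = 3),
                x  = rnorm(n_subj * 3))

# A shared per-subject effect makes observations within a subject correlated.
subj_eff <- rep(rnorm(n_subj, sd = 0.5), each = 3)
t_event  <- rweibull(nrow(d), shape = 1.5,
                     scale = exp(1 + 0.5 * d$x + subj_eff))
t_cens   <- rexp(nrow(d), rate = 0.05)        # independent censoring times

d$time   <- pmin(t_event, t_cens)             # observed follow-up time
d$status <- as.integer(t_event <= t_cens)     # 1 = event, 0 = censored

# cluster(id) requests the robust (sandwich) variance, grouped by subject.
fit <- survreg(Surv(time, status) ~ x + cluster(id), data = d,
               dist = "weibull")

# Compare naive and robust standard errors (intercept, x, log(scale)).
cbind(naive = sqrt(diag(fit$naive.var)), robust = sqrt(diag(fit$var)))
```

The coefficient estimates are the same as in a fit without cluster(id); only
the variance estimate changes.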
2. Time dependent covariates: Each subject has one endpoint, but covariates
change over time. The bookkeeping for time dependent covariates is reasonably
straightforward for proportional hazards (PH) models, but a major pain for an
accelerated failure time (AFT) model. I've thought about it but never
implemented the feature in survreg, though this may change one day due to the
increased interest in accelerated aging as a biological model among the
researchers I work with (but don't hold your breath). For example, if you
smoked in your youth but later quit, in an AFT model this 'adds years' to your
biological age which you never lose; the computer code has to keep track of
covariate histories. In a proportional hazards model today's risk =
function(today's covariates), which is easier. A Weibull model can be written
in either AFT or PH form; survreg uses the AFT form. I don't know which Stata
uses.
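To make the AFT versus PH remark concrete, here is a small sketch that converts
a survreg Weibull fit to the PH parameterization. It uses the lung data set
shipped with the survival package purely as an illustration; the choice of data
set and covariate is mine, not part of the original question. The conversion
relies on the standard Weibull identities shape = 1/scale and
PH coefficient = -(AFT coefficient)/scale.

```r
library(survival)

# Weibull AFT fit: log(T) = intercept + beta * age + scale * W
fit <- survreg(Surv(time, status) ~ age, data = lung, dist = "weibull")

aft_coef <- coef(fit)[-1]          # AFT coefficient(s), intercept dropped
shape    <- 1 / fit$scale          # Weibull shape in the PH form
ph_coef  <- -aft_coef / fit$scale  # log hazard ratio per unit of covariate

# A positive AFT coefficient (longer survival) corresponds to a
# negative PH coefficient (lower hazard), and vice versa.
rbind(AFT = aft_coef, PH = ph_coef)
```

The two rows describe the same fitted model; only the scale on which the
covariate effect is reported differs.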
3. Multiple events per subject, with a single time scale per subject. This is
seen in reliability analysis, where hazard = function of age. survreg does not
handle this case either.
Terry Therneau