[R] Survival Regression with multiple events per subject

Terry Therneau therneau at mayo.edu
Mon Apr 28 15:22:21 CEST 2008


> I want to process a maximum likelihood estimation for a parametric
> regression survival time model with multiple events per subject.

  Data sets with multiple records per subject are used for several things; you
need to tell me what it is that you want to accomplish.  Multiple records is a
method, not a goal.
  
  1. Robust variance: If each observation is a separate measurement on the
subject, with its own covariates, time 0, and endpoint, and you want a "GEE"
type variance that accounts for the fact that multiple observations are for the
same subject:
   survreg(Surv(time, exercise) ~ itm + posret + negret + cluster(id), ...
where id is a variable that is unique for unique subjects.
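
A minimal sketch of the robust-variance call in item 1, run on simulated data.
Everything about the data is an assumption made for illustration: the data
frame d, the covariate x, the shared subject effect u, and the sample sizes are
not from the original question.  The only point is that adding cluster(id)
leaves the coefficients alone and changes the variance estimate.

```r
library(survival)

# Simulated data, purely illustrative: 50 subjects, 3 observations each.
set.seed(42)
n_subj <- 50
n_obs  <- 3
n      <- n_subj * n_obs
id     <- rep(seq_len(n_subj), each = n_obs)
u      <- rep(rnorm(n_subj, sd = 0.5), each = n_obs)  # shared within-subject effect
x      <- rnorm(n)

# Weibull AFT model: log(T) = 1 + 0.5*x + u + 0.7*W, W = log of a unit exponential
t_event <- exp(1 + 0.5 * x + u + 0.7 * log(rexp(n)))
t_cens  <- runif(n, min = 1, max = 30)                # independent censoring
d <- data.frame(id = id, x = x,
                time   = pmin(t_event, t_cens),
                status = as.integer(t_event <= t_cens))

# Model-based variance versus robust variance clustered on subject
fit_naive  <- survreg(Surv(time, status) ~ x, data = d)
fit_robust <- survreg(Surv(time, status) ~ x + cluster(id), data = d)

# Same point estimates; only the standard errors differ.
cbind(coef      = coef(fit_robust),
      se_naive  = sqrt(diag(vcov(fit_naive)))[1:2],
      se_robust = sqrt(diag(vcov(fit_robust)))[1:2])
```

With positively correlated observations within a subject, the robust standard
errors would usually be expected to come out larger than the naive ones, but
that is not guaranteed in any one sample.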
   
  2. Time dependent covariates: Each subject has one endpoint, but covariates
change over time.  The bookkeeping for time dependent covariates is reasonably
straightforward for proportional hazards models, but a major pain for an
accelerated failure time (AFT) model.  I've thought about it but never
implemented the feature in survreg, though this may change one day due to the
increased interest in accelerated aging as a biological model among the
researchers I work with (but don't hold your breath).  For example, if you
smoked in your youth but later quit, in an AFT model this 'adds years' to your
biological age which you never lose; the computer code has to keep track of
covariate histories.  In a proportional hazards model today's risk =
function(today's covariates), which is easier.  A Weibull can be written in
either AFT or PH form; survreg uses the AFT style.  I don't know which Stata
uses.
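
To make the AFT versus PH remark concrete, here is a small sketch of converting
a survreg Weibull fit to the proportional hazards scale.  It assumes the usual
survreg parameterization, log(T) = x'beta + sigma*W, under which the Weibull
shape is 1/sigma and the log hazard ratios are -beta/sigma.  The lung data set
and the covariates age and sex are chosen only for illustration.

```r
library(survival)

# Weibull fit in survreg's AFT parameterization (lung ships with the package)
fit <- survreg(Surv(time, status) ~ age + sex, data = lung, dist = "weibull")

beta_aft <- coef(fit)[-1]   # AFT coefficients: effect on log(time), intercept dropped
sigma    <- fit$scale       # survreg's scale parameter

shape    <- 1 / sigma          # Weibull shape in the rweibull() convention
gamma_ph <- -beta_aft / sigma  # same model written as log hazard ratios

# Side-by-side view: acceleration factors and the implied hazard ratios
cbind(accel_factor = exp(beta_aft),
      hazard_ratio = exp(gamma_ph))
```

A covariate that lengthens survival (positive AFT coefficient) lowers the
hazard, hence the sign flip.  Comparing gamma_ph with coefficients from coxph
on the same formula is a reasonable sanity check, though the two need not agree
closely because the Cox model leaves the baseline hazard unspecified.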

  3. Multiple events per subject, with a single time scale per subject.  This is 
seen in reliability analysis where hazard = function of age.  Survreg does not 
handle this case either.  
  
  	Terry Therneau


