[R] Classification and Regression Tree for Survival Analysis

Achim Zeileis Achim.Zeileis at uibk.ac.at
Tue Jun 13 22:28:52 CEST 2017

On Tue, 13 Jun 2017, Dimitrie Siriopol via R-help wrote:

> I am trying to use CART in a survival analysis. I have three variables of interest (x, y and z, all ordinal with 5 categories each) that I want to combine into smaller groups (for example, the 1st category of x with the 2nd and 3rd categories of y and the 2nd, 3rd and 4th categories of z, etc.) based on their association with, say, mortality.
> I would also like this analysis to be adjusted for a number of variables (which I don't want to incorporate in the decision tree, only to take into account in the relationship between the 3 variables and the outcome; I should also mention that these confounders have missing values - how should this be dealt with?), and I would like the survival analysis to be stratified and to use clusters.
> I have tried the party and rpart packages, but I can't work out how to do what I want.

I don't think that such an analysis is available "out of the box". In 
principle, you can iterate between (a) estimating a survival regression 
with the confounders - given the groups from the tree, and (b) estimating 
the tree - given an offset in the survival regression for the confounders. 
Such a strategy is implemented in the palmtree() function from the 
"partykit" package - however only for lm() and glm() models, not for 
survreg(). But the same idea could be applied in that case as well, e.g., 
using a Weibull distribution.
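
A rough sketch of what such an iteration could look like with a Weibull 
survreg() is given here. It is not what palmtree() does internally, and 
several things are assumptions made for illustration only: the simulated 
data, the variable names (age, sex, time_adj, node), and the cap of 10 
iterations. Instead of a model-based tree with an offset, step (b) uses 
ctree() on rescaled times: in an accelerated failure time model an offset 
on the log-time scale is equivalent to dividing the times by exp(offset).

```r
library("survival")
library("partykit")

## Simulated data (purely illustrative): three ordinal factors of interest
## plus two confounders without missing values.
set.seed(1)
n <- 500
d <- data.frame(
  x = factor(sample(1:5, n, replace = TRUE), levels = 1:5, ordered = TRUE),
  y = factor(sample(1:5, n, replace = TRUE), levels = 1:5, ordered = TRUE),
  z = factor(sample(1:5, n, replace = TRUE), levels = 1:5, ordered = TRUE),
  age = rnorm(n, mean = 60, sd = 10),
  sex = factor(sample(c("f", "m"), n, replace = TRUE))
)
## Weibull AFT model: shorter survival for x >= 4, plus confounder effects.
lp <- 2 - 0.8 * (as.integer(d$x) >= 4) - 0.03 * (d$age - 60) +
  0.3 * (d$sex == "m")
t_event <- exp(lp) * rweibull(n, shape = 1.5)
t_cens <- rexp(n, rate = 0.05)
d$time <- pmin(t_event, t_cens)
d$status <- as.integer(t_event <= t_cens)

off <- rep(0, n)              # confounder offset on the log-time scale
node <- rep(NA_integer_, n)   # no grouping known before the first tree

for (iter in 1:10) {
  ## (b) Tree given the offset: remove the confounder effect from the
  ##     times and let ctree() split on x, y, z only.
  d$time_adj <- d$time / exp(off)
  tree <- ctree(Surv(time_adj, status) ~ x + y + z, data = d)
  new_node <- as.integer(predict(tree, type = "node"))

  ## (a) Weibull regression given the groups from the tree. If the tree
  ##     did not split, only the confounders enter the model.
  d$node <- factor(new_node)
  f <- if (nlevels(d$node) > 1) {
    Surv(time, status) ~ node + age + sex
  } else {
    Surv(time, status) ~ age + sex
  }
  fit <- survreg(f, data = d, dist = "weibull")

  ## Offset = contribution of the confounders to the linear predictor.
  b <- coef(fit)
  off <- drop(cbind(d$age, d$sex == "m") %*% b[c("age", "sexm")])

  ## Stop once the grouping no longer changes.
  if (identical(new_node, node)) break
  node <- new_node
}

print(tree)
summary(fit)
```

The terminal nodes of the final tree are the groups formed from x, y and 
z, and the survreg() fit gives the confounder effects conditional on those 
groups. The p-values that ctree() uses for splitting do not account for 
the estimated offset, so they should be read with some caution.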

For incorporating stratification/clustering one could either use clustered 
inference in the variable selection or add some random effect. For lm/glm 
this is provided in the package "glmertree" but I don't think there are 
readily available code blocks to do the same for a survival response.
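
For the regression part alone (not the tree selection), survreg() does 
accept strata() and cluster() terms, so stratified scale parameters and 
cluster-robust standard errors are at least available there. A minimal 
sketch, using the lung data shipped with "survival" only as a stand-in 
(treating sex as the stratum and inst as the cluster is an arbitrary choice 
for illustration):

```r
library("survival")

## Weibull regression with a separate scale parameter per stratum (sex)
## and a robust (sandwich) variance clustered by institution.
## Rows with a missing institution code are dropped by the default na.action.
fit <- survreg(Surv(time, status) ~ age + ph.ecog + strata(sex) +
  cluster(inst), data = lung, dist = "weibull")

fit$scale                      # one scale estimate per stratum
sqrt(diag(fit$var))            # cluster-robust standard errors
sqrt(diag(fit$naive.var))      # model-based standard errors, for comparison
```

This only adjusts the inference for the regression coefficients; the 
variable selection in the tree would still ignore the clustering.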

And as for the missing values in the confounders: I can't think of a good 
strategy for this. One could try generic imputation strategies, but these 
would very likely affect the subsequent regression and tree selection 
process.

References for palmtree and glmertree:

> Thank you
