[R] rpart - how to estimate the “meaningful” predictors for an outcome (in classification trees)

Xiaogang Su xiaogangsu at gmail.com
Wed Dec 15 00:03:37 CET 2010


Hi, Tal,

Here is a quick workaround. First create two binary responses via
dummy variables

y1 <- ifelse(y=="a", 1, 0)
y2 <- ifelse(y=="b", 1, 0)

and then build two separate tree models, one for y1 and one for y2. The
predictors each tree splits on are then the ones that matter for that class.
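
To make this concrete, here is a minimal sketch of that one-vs-rest idea. It regenerates the simulated x1, x2 and y from your example so the block runs on its own. The names dat, fit_a, fit_b and splits_used are only illustrative, and using method = "class" plus a lookup in the fitted object's frame$var column is my assumption about a convenient way to read off the split variables, not the only way to do it.

```r
library(rpart)

# Simulated data, as in the quoted message
N <- 200
set.seed(5123)
x1 <- runif(N)
x2 <- runif(N)
y <- sample(letters[1:3], N, TRUE)
y[x1 < .5] <- "a"
y[x2 < .1] <- "b"

# One binary (dummy) response per class of interest
dat <- data.frame(
  x1 = x1,
  x2 = x2,
  y1 = factor(ifelse(y == "a", 1, 0)),  # "a" vs. everything else
  y2 = factor(ifelse(y == "b", 1, 0))   # "b" vs. everything else
)

# A separate classification tree for each binary response
fit_a <- rpart(y1 ~ x1 + x2, data = dat, method = "class")
fit_b <- rpart(y2 ~ x1 + x2, data = dat, method = "class")

# Hypothetical helper: names of the variables a tree actually splits on
# (terminal nodes are labelled "<leaf>" in fit$frame$var)
splits_used <- function(fit) {
  v <- as.character(fit$frame$var)
  unique(v[v != "<leaf>"])
}

used_a <- splits_used(fit_a)
used_b <- splits_used(fit_b)
print(used_a)
print(used_b)
```

Comparing used_a with used_b shows which predictors each class-specific tree relies on. Printing fit_a and fit_b gives the split points as well.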

Hope it helps.
Xiaogang


On Tue, Dec 14, 2010 at 8:33 AM, Tal Galili <tal.galili at gmail.com> wrote:
> Hi dear R-help members,
>
> When building a CART model (specifically a classification tree) using rpart,
> it is sometimes obvious that some variables (X's) are meaningful for
> predicting certain values of the outcome (y), while other predictors are
> relevant only for other outcome values.
>
> *How can one estimate which explanatory variable is "used" for which of
> the predicted values of the outcome variable?*
>
> Here is example code in which x2 is the only important variable for
> predicting "b" (one of the y outcomes). There is no predictor for "c",
> and x1 is a predictor for "a", provided that x2 permits it.
>
> How can this situation be shown using a fitted rpart model?
>
> library(rpart)
>
> N <- 200
> set.seed(5123)
>
> x1 <- runif(N)
> x2 <- runif(N)
> x3 <- runif(N)  # not used in the model below
>
> y <- sample(letters[1:3], N, TRUE)
> y[x1 < .5] <- "a"
> y[x2 < .1] <- "b"  # applied last, so it overrides the x1 rule
>
> fit <- rpart(y ~ x1 + x2)
> fit2 <- prune(fit, cp = 0.07)
>
> plot(fit2)
> text(fit2, use.n = TRUE)
>
> Thanks,
>
> Tal
>
>
>
> ----------------Contact
> Details:-------------------------------------------------------
> Contact me: Tal.Galili at gmail.com |  972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
> ----------------------------------------------------------------------------------------------



-- 
==============================
Xiaogang Su, Ph.D.
Associate Professor, Statistician
School of Nursing, University of Alabama
Birmingham, AL 35294-1210
(205) 934-2355 [Office]
xgsu at uab.edu
xiaogangsu at gmail.com
http://homepage.uab.edu/xgsu/


