[R] How to show which variables include in plot of classification tree

Uwe Ligges ligges at statistik.uni-dortmund.de
Fri Mar 18 19:45:41 CET 2005


Muhammad Subianto wrote:

> Dear all,
> For my research, I am currently learning about classification.
> I have been trying some examples with classification tree packages, such as 
> tree and rpart. For instance,
> the Pima.te dataset has 8 variables (including the class variable, type):
> 
> library(rpart)
> library(MASS)  # Pima.te is provided by MASS, not by datasets
> pima.rpart <- rpart(type ~ npreg+glu+bp+skin+bmi+ped+age,data=Pima.te, 
> method='class')
> plot(pima.rpart, uniform=TRUE)
> text(pima.rpart)
> summary(pima.rpart)
> 
> In the result, only 5 variables (npreg, glu, bmi, ped, and age) 
> appear in the plot.
> Now I have 50 variables in my own dataset. In the resulting classification 
> tree it is very difficult to tell which
> variables appear in the plot. Is there a trick for finding out which 
> variables appear in the plot?


1. Please read a good book on classification. You might also want to 
take a look at Breiman et al. (1984), cited in ?rpart.

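2. rpart does variable selection when growing the tree: only variables 
chosen for at least one split appear in the plot, so you should not 
expect to find all 50 variables there. See, e.g., ?rpart.control for the 
parameters that control how far the tree is grown.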

3. You have specified the formula "type ~ npreg + glu + bp + skin + bmi 
+ ped + age", so in particular you cannot expect to get more variables 
than "npreg + glu + bp + skin + bmi + ped + age"
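
To list the variables a fitted tree actually splits on, without reading 
them off the plot, one option is to inspect the fitted object directly. 
The following is only a minimal sketch, assuming the Pima.te data from 
the MASS package and the formula from the question; the names fit, 
used, candidates and unused are chosen here for illustration.

```r
library(rpart)
library(MASS)   # assumption: Pima.te is taken from MASS

fit <- rpart(type ~ npreg + glu + bp + skin + bmi + ped + age,
             data = Pima.te, method = "class")

# fit$frame has one row per node; its 'var' column holds the name of the
# splitting variable, or "<leaf>" for terminal nodes.
split_vars <- as.character(fit$frame$var)
used <- unique(split_vars[split_vars != "<leaf>"])

# Predictors offered in the formula, and those never chosen for a split.
candidates <- attr(terms(fit), "term.labels")
unused <- setdiff(candidates, used)

print(used)     # variables that appear in the plotted tree
print(unused)   # variables in the formula that the tree did not use

# printcp() also reports the variables actually used in tree construction.
printcp(fit)
```

With 50 predictors the same lines apply unchanged; only the formula and 
the data argument differ.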

Uwe Ligges






> Thanks for your help.
> Muhammad Subianto