[R] questions on rpart (tree changes when rearrange the order of covariates?!)

Uwe Ligges ligges at statistik.tu-dortmund.de
Wed May 13 11:30:36 CEST 2009



Yuanyuan wrote:
> Greetings,
> 
> I am using rpart for classification with the "class" method. The test data is
> the Pima Indians diabetes data from package mlbench.
> 
> I first fitted a classification tree using the original data, and then
> exchanged the order of body mass and plasma glucose, which are the
> strongest variables in the growing phase. The second tree is a
> little different from the first one, and the misclassification tables
> differ too. I did not change the data, so why are the results
> different?

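Well, at some splits two variables may yield exactly the same reduction
in the splitting criterion (the Gini index by default for method="class").
In that case the variable that comes first is used, so reordering the
covariates can change that split and everything below it, hence another
result.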
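
A minimal sketch of this effect, on made-up data rather than the diabetes
data: x2 is an exact copy of x1, so every candidate split on one ties with
the same split on the other. The names x1, x2, d_a, d_b, fit_a and fit_b
are illustrative assumptions, not part of the original question. The code
prints the root split variable for both column orders, and the first rows
of the splits table, where the primary split and its competitor can be
compared by their "improve" values.

```r
library(rpart)

set.seed(1)
n  <- 200
x1 <- runif(n)
x2 <- x1   # assumption for illustration: an exact copy, so splits tie exactly
y  <- factor(ifelse(x1 + rnorm(n, sd = 0.2) > 0.5, "pos", "neg"))

# Same data, two column orders
d_a <- data.frame(y = y, x1 = x1, x2 = x2)
d_b <- d_a[, c("y", "x2", "x1")]

fit_a <- rpart(y ~ ., data = d_a, method = "class")
fit_b <- rpart(y ~ ., data = d_b, method = "class")

# The first row of the frame describes the root node
root_a <- as.character(fit_a$frame$var[1])
root_b <- as.character(fit_b$frame$var[1])
print(c(original = root_a, reordered = root_b))

# Primary split and its competitor at the root, with their improvement
print(head(fit_a$splits[, c("count", "improve")], 2))
```

Because x2 duplicates x1 here, both trees partition the cases in the same
way. With real covariates such as mass and glucose, a tied split on a
different variable sends different cases left and right, so the subtrees
and the misclassification tables can differ.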

Uwe Ligges




> 
> Does anyone know how rpart deals with ties?
> 
> Here is the code for running the two trees.
> 
> 
> library(mlbench)
> data(PimaIndiansDiabetes2)
> mydata<-PimaIndiansDiabetes2
> library(rpart)
> fit2<-rpart(diabetes~., data=mydata,method="class")
> plot(fit2,uniform=T,main="CART for original data")
> text(fit2,use.n=T,cex=0.6)
> printcp(fit2)
> table(predict(fit2,type="class"),mydata$diabetes)
> ## misclassification table: rows are fitted class
>       neg pos
>   neg 437  68
>   pos  63 200
> #Klimt(fit2,mydata)
> 
> pmydata<-data.frame(mydata[,c(1,6,3,4,5,2,7,8,9)])
> fit3<-rpart(diabetes~., data=pmydata,method="class")
> plot(fit3,uniform=T,main="CART after exchanging mass & glucose")
> text(fit3,use.n=T,cex=0.6)
> printcp(fit3)
> table(predict(fit3,type="class"),pmydata$diabetes)
> ## after exchanging the order of body mass and plasma glucose
>       neg pos
>   neg 436  64
>   pos  64 204
> #Klimt(fit3,pmydata)
> 
> 
> Thanks,
> 
> 
> --------------------------------------------------------------------------------------
> Yuanyuan Huang
> 



