[R] Request: Optimum value of cost complexity parameter "k" in "tree" package

Uwe Ligges ligges at statistik.tu-dortmund.de
Thu Apr 2 12:12:30 CEST 2009



Muhammad Azam wrote:
> Dear R community
> I have a question regarding the value of the cost-complexity parameter "k" used in the "tree" package for pruning. Any help in finding the optimum value of "k" would be appreciated. In the example below I used k=0, but I don't know why. If I use k=NULL instead, the resulting tree is not plotted.

k = 0 means no pruning at all; higher values will result in "more" 
pruning. Which value to choose is up to you: Do you just want a 
reasonably complex tree in order to explain things? Or do you want to 
optimize some performance measure? In the latter case: Which one? 
Misclassification rate? Then you probably want to use cross-validation 
over different values of k. And so on.
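
A rough sketch of that cross-validation step, assuming the "tree" package 
is installed and using the built-in iris data from the question. The names 
fit, path, cv, best.size and pruned are only illustrative, and picking the 
smallest tree among those with minimal CV error is one possible rule, not 
the only one. Note that prune.misclass() without k or best returns the 
whole pruning sequence rather than a single tree, which is why plotting 
the result of k=NULL does not draw a tree.

```r
library(tree)

set.seed(1)  # cross-validation folds are drawn at random

# Grow a large tree, with the control settings from the question
fit <- tree(Species ~ ., data = iris,
            control = tree.control(nobs = nrow(iris), minsize = 5, mincut = 2))

# Without k or best, the result is a pruning sequence, not a tree:
# path$k holds the candidate k values, path$size the matching tree sizes
path <- prune.misclass(fit)

# 10-fold cross-validation along that sequence;
# cv$dev is the cross-validated number of misclassifications
cv <- cv.tree(fit, FUN = prune.misclass, K = 10)

# Assumption: take the smallest tree among those with minimal CV error
best.size <- min(cv$size[cv$dev == min(cv$dev)])
pruned <- prune.misclass(fit, best = best.size)

# A single-node tree cannot be plotted, so guard the plot
if (best.size > 1) {
  plot(pruned)
  text(pruned)
}
```

Comparing cv$dev against cv$size (or cv$k) by eye is often more 
informative than taking the minimum blindly, since several sizes may have 
nearly the same CV error.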

But then, better read a good book about trees. And also note that the 
author of the tree package suggests using the package "rpart" instead.
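
For comparison, a minimal sketch of the same idea with "rpart", again on 
iris. Here the cross-validation is done while fitting and stored in the 
cptable component. The names fit, cptab, best.cp and pruned are 
illustrative, and choosing the row with the lowest xerror is an assumption 
(the 1-SE rule is a common alternative). Also note that rpart's cp is a 
scaled complexity parameter, so its values are not directly comparable to 
k in "tree".

```r
library(rpart)

set.seed(1)  # the built-in cross-validation is random

# Grow a deliberately large tree (cp = 0) with 10-fold cross-validation
fit <- rpart(Species ~ ., data = iris, method = "class",
             control = rpart.control(minsplit = 5, cp = 0, xval = 10))

# Table of complexity parameters with cross-validated error (xerror)
printcp(fit)
cptab <- fit$cptable

# Assumption: choose the cp with the lowest cross-validated error;
# which.min() returns the first (i.e. smallest) tree in case of ties
best.cp <- cptab[which.min(cptab[, "xerror"]), "CP"]

pruned <- prune(fit, cp = best.cp)
print(pruned)
```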

Uwe Ligges



> 
> library(tree)
> ds <- iris
> iris <- transform(iris, Species = factor(Species, labels = letters[1:3]))
> miris <- tree(Species ~ ., data = iris,
>               control = tree.control(nobs = 150, minsize = 5, mincut = 2))
> iris.prun <- prune.tree(miris, method = "misclass", best = NULL, k = 0)
> iris.prun
> summary(iris.prun)
> plot(iris.prun)
> 
> 
>  
> best regards
> 
> Muhammad Azam 



