[R] Classification Trees

Vladimir N. Kutinsky kutinskyv at obninsk.com
Sat Jun 9 15:23:07 CEST 2001


I apologize if you receive multiple copies of this letter. This is the first time I've written to this mailing list, so please be kind:-)
 
Hello everyone!
I'm trying to write a programme that grows a classification tree. I use the APL programming language, and I use R to compare and test the results.
I have a classification tree and a sequence of cost-complexity parameters (alphas): 0, A1, A2, ..., An. Now I want to choose a right-sized tree or, in other words, the optimal complexity parameter Ak. I understand that I should use V-fold cross-validation. The problem is that I don't quite understand how to prune the trees grown in CV:
1. If I use the initial sequence of alphas: 
To test A1, I snip off all nodes whose cost-complexity parameter lies in the range [0, A1]; to test A2, I prune all nodes whose cost-complexity parameter lies in the range [A1, A2]; and so on. Is this correct?

2. If I use a new sequence of complexity parameters 0, B1, B2, ..., Bm, where Bi is the geometric mean of A[i] and A[i+1], Bi = SQRT( A[i] * A[i+1] ):
Suppose I select Bk as the optimal parameter. Which Ai does this optimal Bk correspond to?
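
To make the second option concrete, here is a minimal sketch of how the B sequence could be computed and mapped back to the A sequence. It assumes the usual CART convention (my assumption, not something confirmed on this list) that the subtree found for A[i] stays optimal for every alpha in [A[i], A[i+1]), so Bi is just a representative value from inside that interval. The function names and the alpha values are made up for illustration, and Python is used only for readability.

```python
import bisect
import math


def geometric_midpoints(alphas):
    """Return B[i] = sqrt(A[i] * A[i+1]) for each pair of consecutive alphas."""
    return [math.sqrt(a * b) for a, b in zip(alphas, alphas[1:])]


def interval_index(alphas, value):
    """Return i such that alphas[i] <= value < alphas[i+1].

    Assumes alphas is sorted in increasing order and starts at 0.
    """
    return bisect.bisect_right(alphas, value) - 1


# Hypothetical cost-complexity sequence 0, A1, A2, A3 (illustrative values only).
alphas = [0.0, 0.01, 0.04, 0.25]
betas = geometric_midpoints(alphas)

for b in betas:
    i = interval_index(alphas, b)
    print(f"B = {b:.4f} lies in the interval starting at A[{i}] = {alphas[i]}")
```

Under that assumption, a selected Bk would point back to the subtree belonging to the A value at the lower end of its interval.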

Which of the two ways should I follow? Are there any other ways of choosing a right-sized tree? Does anybody have any ideas?
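
For reference, this is a minimal sketch of what I mean by pruning a tree for a single value of alpha. It assumes the standard weakest-link definition g(t) = (R(t) - R(T_t)) / (|T_t| - 1), where R(t) is the error of node t taken as a leaf, R(T_t) is the summed error of the leaves below t, and |T_t| is the number of those leaves. The Node class, the helper names and the risk values are hypothetical, and this is not necessarily how R does it internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    risk: float                       # R(t): error if this node were a leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self):
        return self.left is None and self.right is None


def subtree_stats(node):
    """Return (sum of leaf risks, number of leaves) for the branch rooted at node."""
    if node.is_leaf():
        return node.risk, 1
    left_risk, left_leaves = subtree_stats(node.left)
    right_risk, right_leaves = subtree_stats(node.right)
    return left_risk + right_risk, left_leaves + right_leaves


def weakest_link(node):
    """Return (g, node) for the internal node with the smallest g(t), or None."""
    if node.is_leaf():
        return None
    branch_risk, leaves = subtree_stats(node)
    best = ((node.risk - branch_risk) / (leaves - 1), node)
    for child in (node.left, node.right):
        candidate = weakest_link(child)
        if candidate is not None and candidate[0] < best[0]:
            best = candidate
    return best


def prune(root, alpha):
    """Collapse weakest links in place for as long as their g(t) <= alpha."""
    while True:
        link = weakest_link(root)
        if link is None or link[0] > alpha:
            return root
        link[1].left = link[1].right = None   # turn the node into a leaf


def build_example():
    """Small made-up tree: the root has one internal child and one leaf child."""
    inner = Node(0.2, Node(0.05), Node(0.05))
    return Node(0.5, inner, Node(0.1))


for alpha in (0.05, 0.12, 0.3):
    pruned = prune(build_example(), alpha)
    print(alpha, "leaves:", subtree_stats(pruned)[1])
```

The sketch recomputes g(t) after every collapse, because removing one branch changes the values for its ancestors.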
Thank you

Kutinsky Vladimir