[R] CART vs. Random Forest

Wed Sep 25 21:51:57 CEST 2002

According to Dr. Breiman, RF should be a more accurate method
than a single tree. However, in my case the relative performance
of the two methods seems to depend on the proportion of the
outcome variable. My data set is a typical classification problem
(predicting bad guys). When I ran both methods with different
proportions of the outcome classes (there is a criterion that
measures the degree of bad behavior), I got very strange results.

1. proportion of 1 to 0 = 1:4
err.rate of CART = 25.2%
err.rate of RF = 25.6%

2. 1:9 
err.rate of CART = 28.5%
err.rate of RF = 21.2%

3. 1:33
err.rate of CART = 28.2%
err.rate of RF = 12.1%

4. 1:99
err.rate of CART = 25.1%
err.rate of RF = 7.3%
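
For reference, the error rate of a trivial rule that always
predicts the majority class "0" follows directly from each ratio.
The sketch below is in Python rather than R, purely for
illustration. It assumes the ratios are exact and that the error
rates were measured on data with the same class ratio; neither is
stated above, and the function name is made up.

```python
# Class ratios (1s : 0s) taken from the four runs listed above.
ratios = {"1:4": (1, 4), "1:9": (1, 9), "1:33": (1, 33), "1:99": (1, 99)}


def majority_baseline_error(ones, zeros):
    """Error rate of a rule that always predicts the larger class.

    Assumes the evaluation data has exactly this class ratio.
    """
    return min(ones, zeros) / (ones + zeros)


baseline = {name: majority_baseline_error(*r) for name, r in ratios.items()}

for name, err in baseline.items():
    print(f"{name}: always-'0' error = {err:.1%}")
```

Under those assumptions the always-"0" rule would score 20.0%,
10.0%, 2.9% and 1.0%, so every error rate listed above, for both
CART and RF, is higher than this trivial baseline.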

In 3 & 4, RF looks superior to CART, but I'm afraid RF is just
voting for "0" to reduce the error rate. Any suggestions?
Thank you.
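
One way to check the worry about voting for "0" would be to look
at the error rate within each class instead of the overall rate.
Below is a minimal sketch in Python (not R), using made-up counts
that are NOT from my data; the function name and the confusion
table layout are assumptions for illustration only.

```python
def per_class_error(confusion):
    """Return the error rate for each actual class.

    `confusion[actual][predicted]` holds a count of cases.
    """
    rates = {}
    for actual, row in confusion.items():
        total = sum(row.values())
        wrong = total - row.get(actual, 0)
        rates[actual] = wrong / total if total else float("nan")
    return rates


# Hypothetical counts: 1000 cases at a 1:99 ratio, with a
# classifier that predicts "0" almost every time.
example = {
    "0": {"0": 985, "1": 5},
    "1": {"0": 9, "1": 1},
}

rates = per_class_error(example)

n_total = sum(sum(row.values()) for row in example.values())
n_wrong = sum(
    count
    for actual, row in example.items()
    for predicted, count in row.items()
    if predicted != actual
)
overall = n_wrong / n_total

print(f"overall error: {overall:.1%}")
for label, rate in rates.items():
    print(f"error within class {label}: {rate:.1%}")
```

With these invented counts the overall error is low while nearly
all of the "1" cases are misclassified, which is the pattern to
look for in the real confusion tables.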
