[R] RandomForest

Vladimir N. Kutinsky kutinskyv at obninsk.com
Wed Aug 20 15:56:48 CEST 2003


Andy,

First of all, thank you for your reply.
I'm using R 1.6.1 for Windows. A few days ago I updated the randomForest
package from CRAN. It now warns that the package was built under R 1.6.2,
but it seems to work fine.
To make sure we're talking about the same thing, let's take iris
classification as an example.
library(randomForest)
set.seed(17)
rf <- randomForest(Species ~ ., data = iris)
rf$err.rate[1:10]
[1] 0.02000000 0.02666667 0.03333333 0.04666667 0.04666667 0.05333333
[7] 0.05333333 0.06000000 0.06000000 0.05333333
As you can see, a forest of 1, 2, or 3 trees appears to give better
predictive accuracy than a larger one.

> Because the prediction is based on aggregating the out-of-bag prediction,
> the error rate should be number of misclassified cases divided by the
> number of cases that have been predicted.  When the number of trees is
> small, not all cases have been out-of-bag, and therefore not all of them
> have prediction.

I don't quite understand you; maybe I missed something in the theory of
random forests. Do you mean that not all cases of the learning data set are
used to compute the error rate, only some number of them taken at random?
And if so, does this number get larger as the forest grows?
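
To make the question concrete, here is a minimal base R sketch of how I read
your explanation. It does not call randomForest at all. It assumes (my
assumption, not something I have checked in the package) that each tree is
grown on a bootstrap sample of n cases drawn with replacement, and that a case
is "out-of-bag" for a tree when it was not drawn for that tree. The names oob,
ever_oob, n_predicted and p_oob are mine, for illustration only.

```r
# Assumption: each tree uses a bootstrap sample of n cases drawn with
# replacement; a case is out-of-bag for a tree if it was not drawn.
set.seed(17)
n <- 150       # number of cases in iris
ntree <- 10    # look at the first 10 trees only

# oob[i, k] is TRUE when case i is out-of-bag for tree k
oob <- sapply(seq_len(ntree), function(k) {
  !(seq_len(n) %in% sample(n, replace = TRUE))
})

# ever_oob[i, k] is TRUE when case i has been out-of-bag for at least
# one of the first k trees, i.e. it has some out-of-bag prediction
ever_oob <- t(apply(oob, 1, cumsum)) > 0
n_predicted <- colSums(ever_oob)

# Expected counts under the same assumption
p_oob <- (1 - 1 / n)^n                        # out-of-bag chance for one tree
expected <- n * (1 - (1 - p_oob)^(1:ntree))

print(rbind(simulated = n_predicted, expected = round(expected, 1)))
```

If this reading is right, the number of cases that have a prediction after k
trees would be n_predicted[k], which starts well below n and approaches n as
the forest grows.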

Again thanks,
Vladimir



