[R] RandomForest question

Liaw, Andy andy_liaw at merck.com
Thu Jul 21 18:59:33 CEST 2005


See the tuneRF() function in the package for an implementation of 
the strategy recommended by Breiman & Cutler.
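
A minimal sketch of such a call, using the iris data as a stand-in (the
seed, the data set, and the argument values are illustrative assumptions,
not recommendations; ntreeTry = 30 simply follows the "grow 20-30 trees"
advice quoted below):

```r
library(randomForest)

set.seed(1)                      # forests are random, so fix the seed
x <- iris[, -5]                  # the four predictors
y <- iris$Species                # the class labels

# Search over mtry, growing 30 trees per candidate value.
# plot = FALSE and trace = FALSE keep the call quiet.
tuned <- tuneRF(x, y, ntreeTry = 30, stepFactor = 2, improve = 0.05,
                trace = FALSE, plot = FALSE)

# tuned is a matrix with columns "mtry" and "OOBError";
# pick the mtry value with the smallest OOB error.
best_mtry <- tuned[which.min(tuned[, "OOBError"]), "mtry"]
best_mtry
```

With so few trees the OOB estimates are noisy, so it is probably worth
repeating the search with a few different seeds before settling on a value.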

BTW, "randomForest" is the name of the R package only.  See Breiman's 
web page for the notice on trademarks.

Andy

> From: Weiwei Shi 
> 
> Hi,
> I found the following lines from Leo's randomForest, and I am not sure
> if it can be applied here but just tried to help:
> 
> mtry0 = the number of variables to split on at each node. Default is
> the square root of mdim. ATTENTION! DO NOT USE THE DEFAULT VALUES OF
> MTRY0 IF YOU WANT TO OPTIMIZE THE PERFORMANCE OF RANDOM FORESTS. TRY
> DIFFERENT VALUES-GROW 20-30 TREES, AND SELECT THE VALUE OF MTRY THAT
> GIVES THE SMALLEST OOB ERROR RATE.
> 
> mdim is the number of predictors.
> 
> HTH,
> 
> weiwei
> 
> On 7/21/05, Liaw, Andy <andy_liaw at merck.com> wrote:
> > > From: Arne.Muller at sanofi-aventis.com
> > >
> > > Hello,
> > >
> > > > I'm trying to find out the optimal number of variables tried at
> > > > each split (mtry parameter) for a randomForest classification. The
> > > classification is binary and there are 32 explanatory
> > > variables (mostly factors with each up to 4 levels but also
> > > some numeric variables) and 575 cases.
> > >
> > > I've seen that although there are only 32 explanatory
> > > variables the best classification performance is reached when
> > > > choosing mtry=80. How is it possible that more variables can be
> > > > used than there are columns in the data frame?
> > 
> > It's not.  The code for randomForest.default() has:
> > 
> >     ## Make sure mtry is in reasonable range.
> >     mtry <- max(1, min(p, round(mtry)))
> > 
> > so it silently sets mtry to the number of predictors if it's too large.
> > As an example:
> > 
> > > library(randomForest)
> > randomForest 4.5-12
> > Type rfNews() to see new features/changes/bug fixes.
> > > iris.rf = randomForest(Species ~ ., iris, mtry=10)
> > > iris.rf$mtry
> > [1] 4
> > 
> > I should probably add a warning in such cases...
> > 
> > Andy
> > 
> > 
> > >       thanks for your help
> > >       + kind regards,
> > >
> > >       Arne
> > >
> > >
> > >
> > >
> > >       [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> > > http://www.R-project.org/posting-guide.html
> > >
> > >
> > >
> > 
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> > 
> 
> 
> -- 
> Weiwei Shi, Ph.D
> 
> "Did you always know?"
> "No, I did not. But I believed..."
> ---Matrix III
> 
> 
>
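
For reference, the range check quoted above can be tried on its own in
base R.  This is only a sketch: clamp_mtry() is a hypothetical helper
written for illustration, not a function in the randomForest package.

```r
# Hypothetical helper mirroring the quoted line from randomForest.default():
# round mtry, cap it at the number of predictors p, and keep it at least 1.
clamp_mtry <- function(mtry, p) {
  max(1, min(p, round(mtry)))
}

clamp_mtry(80, 32)   # 32: a request for 80 with 32 predictors is capped
clamp_mtry(10, 4)    # 4: the iris case shown in the thread
clamp_mtry(0.2, 32)  # 1: values that round to 0 are raised to 1
```

So mtry=80 with 32 predictors is fitted as mtry=32, i.e. every predictor
is tried at every split.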



