[R] RandomForest question

Weiwei Shi helprhelp at gmail.com
Thu Jul 21 17:16:57 CEST 2005


Hi,
I found the following lines in the documentation for Leo Breiman's
random forests code. I am not sure whether they apply here, but they
may help:

mtry0 = the number of variables to split on at each node. Default is
the square root of mdim. ATTENTION! DO NOT USE THE DEFAULT VALUES OF
MTRY0 IF YOU WANT TO OPTIMIZE THE PERFORMANCE OF RANDOM FORESTS. TRY
DIFFERENT VALUES-GROW 20-30 TREES, AND SELECT THE VALUE OF MTRY THAT
GIVES THE SMALLEST OOB ERROR RATE.

mdim is the number of predictors.
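
As a rough illustration of that advice, here is a minimal R sketch that
tries several mtry values with a small forest and keeps the one with the
lowest OOB error. The helpers clamp_mtry() and best_mtry() are
hypothetical names of my own, not randomForest functions; the iris data,
ntree = 30 and the candidate range 1:10 are assumptions chosen only for
the example.

```r
# Sketch only: clamp_mtry() and best_mtry() are hypothetical helpers,
# not part of the randomForest package.

# Same range check as the line Andy quotes from randomForest.default().
clamp_mtry <- function(mtry, p) max(1, min(p, round(mtry)))

# Call oob_error(m) for each distinct usable mtry; return the m with the
# smallest error. oob_error is any function mapping mtry to an error rate.
best_mtry <- function(candidates, p, oob_error) {
  m <- unique(vapply(candidates, clamp_mtry, numeric(1), p = p))
  err <- vapply(m, oob_error, numeric(1))
  m[which.min(err)]
}

# Example with randomForest, run only if the package is installed.
if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(1)
  p <- ncol(iris) - 1  # 4 predictors
  oob <- function(m) {
    rf <- randomForest::randomForest(Species ~ ., data = iris,
                                     mtry = m, ntree = 30)
    # OOB error rate after the last tree
    unname(rf$err.rate[rf$ntree, "OOB"])
  }
  print(best_mtry(1:10, p, oob))
}
```

Because of the range check, every candidate above the number of
predictors collapses to p, so with 32 predictors mtry=80 and mtry=32
describe the same setting. The selected value will vary with the seed
when so few trees are grown.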

HTH,

weiwei

On 7/21/05, Liaw, Andy <andy_liaw at merck.com> wrote:
> > From: Arne.Muller at sanofi-aventis.com
> >
> > Hello,
> >
> > I'm trying to find out the optimal number of splits (mtry
> > parameter) for a randomForest classification. The
> > classification is binary and there are 32 explanatory
> > variables (mostly factors with each up to 4 levels but also
> > some numeric variables) and 575 cases.
> >
> > I've seen that although there are only 32 explanatory
> > variables the best classification performance is reached when
> > choosing mtry=80. How is it possible that more variables can be
> > used than there are columns in the data frame?
> 
> It's not.  The code for randomForest.default() has:
> 
>     ## Make sure mtry is in reasonable range.
>     mtry <- max(1, min(p, round(mtry)))
> 
> so it silently sets mtry to the number of predictors if it's too large.
> As an example:
> 
> > library(randomForest)
> randomForest 4.5-12
> Type rfNews() to see new features/changes/bug fixes.
> > iris.rf = randomForest(Species ~ ., iris, mtry=10)
> > iris.rf$mtry
> [1] 4
> 
> I should probably add a warning in such cases...
> 
> Andy
> 
> 
> >       thanks for your help
> >       + kind regards,
> >
> >       Arne
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html


-- 
Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III



