[R] RandomForest question

Uwe Ligges ligges at statistik.uni-dortmund.de
Thu Jul 21 16:31:36 CEST 2005


Arne.Muller at sanofi-aventis.com wrote:

> Hello,
> 
> I'm trying to find the optimal number of variables tried at each
> split (the mtry parameter) for a randomForest classification. The
> classification is binary and there are 32 explanatory variables
> (mostly factors with up to 4 levels each, but also some numeric
> variables) and 575 cases.
> 
> I've seen that although there are only 32 explanatory variables, the
> best classification performance is reached when choosing mtry=80. How
> is it possible that more variables can be used than there are columns
> in the data frame?

If some of the variables are factors, dummy variables are generated and 
you get a larger number of variables in the later process.
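
As a rough illustration of how dummy coding in general inflates the
number of columns, here is a minimal sketch in base R using
model.matrix(). The toy data frame d and its column names are made up
for the example, and it only demonstrates the expansion itself; whether
and how randomForest performs such an expansion internally is an
assumption here and should be checked against the package
documentation. With the default treatment contrasts, a factor with 4
levels becomes 3 dummy columns, so 32 variables with up to 4 levels
each could yield up to 32 * 3 = 96 columns.

```r
# Hypothetical toy data: two 4-level factors and one numeric variable.
set.seed(1)
d <- data.frame(
  f1 = factor(sample(letters[1:4], 20, replace = TRUE), levels = letters[1:4]),
  f2 = factor(sample(LETTERS[1:4], 20, replace = TRUE), levels = LETTERS[1:4]),
  x  = rnorm(20)
)

# Number of columns in the original data frame.
ncol(d)

# Expand the factors into dummy variables (default treatment contrasts).
X <- model.matrix(~ ., data = d)

# Intercept + 3 dummies for f1 + 3 dummies for f2 + 1 numeric column.
ncol(X)
colnames(X)
```

Comparing ncol(d) with ncol(X) on the real data gives a quick check of
how many columns an expanded design would contain.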

Uwe Ligges


> thanks for your help + kind regards,
> 
> Arne



