[R] memory problems when combining randomForests

Eleni Rapsomaniki e.rapsomaniki at mail.cryst.bbk.ac.uk
Fri Jul 28 10:44:20 CEST 2006


Hi Andy, 

> > I'm using R (windows) version 2.1.1, randomForest version 4.15. 
>                                        ^^^^^^^^^^^^^^^^^^^^^^^^^ 
> Never seen such a version...
Oops! I meant 4.5-15.
 
> > I then save each tree to a file so I can combine them all 
> > afterwards. There are no memory issues when 
> > keep.forest=FALSE. But I think that's the bit I need for 
> > future predictions (right?). 
> 
> Yes, but what is your question?  (Do you mean each *forest*,
> instead of each *tree*?)
I mean the component named "forest" in the object returned by randomForest()
(the one that takes up all the memory!).
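
For concreteness, the workflow described here can be sketched as follows. This
is only a minimal illustration on the built-in iris data, not the actual
analysis: the ntree values and temporary file names are placeholders, and
saveRDS()/readRDS() assume a more recent R than 2.1.1 (save()/load() would play
the same role there).

```r
library(randomForest)

set.seed(1)

# Grow two small forests separately; keep.forest = TRUE retains the "forest"
# component, which predict() needs later.
rf1 <- randomForest(Species ~ ., data = iris, ntree = 50, keep.forest = TRUE)
rf2 <- randomForest(Species ~ ., data = iris, ntree = 50, keep.forest = TRUE)

# Write each forest to its own file (placeholder temp files) and free the memory.
f1 <- tempfile(fileext = ".rds")
f2 <- tempfile(fileext = ".rds")
saveRDS(rf1, f1)
saveRDS(rf2, f2)
rm(rf1, rf2)

# Later: read the forests back and merge them into a single ensemble.
rf_all <- combine(readRDS(f1), readRDS(f2))

# The combined object carries the trees of both forests and can predict.
rf_all$ntree
pred <- predict(rf_all, newdata = iris[1:5, ])
```

Note that all the saved forests have to be in memory together at the point
where combine() is called, and that the combined object no longer carries the
out-of-bag error summaries (confusion, err.rate) of the individual forests.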

> > A bit off the subject, but should the order in which rows
> > (ie. sets of explanatory variables) are passed to the 
> > randomForest function affect the result? I have noticed that 
> > if I pick a random unordered sample from my control data for 
> > training the error rate is much lower than if I take an
> > ordered sample. This remains true for all my cross-validation 
> > results. 
> 
> I'm not sure I understand.  In randomForest() (as in other
> functions) variables are in columns, rather than rows, so
> are you talking about variables (columns) in different order 
> or data (rows) in different order?

Sorry for the confusion. I mean the order in which the data (rows) are passed,
not the columns.
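
To make the comparison concrete, here is a minimal sketch of the two ways of
drawing a training sample. It uses the built-in iris data as a stand-in for
the control data, and the sample size of 100 is arbitrary. It only shows how
the composition of an ordered sample can differ from that of a random one;
whether that explains the difference in error rate is an assumption, not
something established in this thread.

```r
set.seed(42)

# iris is stored sorted by Species (50 rows per class), so it stands in for
# an ordered data set here.
ordered_idx <- 1:100                    # "ordered" sample: the first 100 rows
random_idx <- sample(nrow(iris), 100)   # random sample of 100 rows

# Class composition of each training sample.
ordered_counts <- table(iris$Species[ordered_idx])
random_counts <- table(iris$Species[random_idx])

print(ordered_counts)   # the first 100 rows contain no virginica at all
print(random_counts)
```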

Finally, I see from
http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#inter

that there is a component in Breiman's implementation of random forests that
computes interactions between variables. Has this been implemented in the R
package yet?

Many thanks for your time and help.
Eleni Rapsomaniki


