[R] memory problems when combining randomForests
Eleni Rapsomaniki
e.rapsomaniki at mail.cryst.bbk.ac.uk
Fri Jul 28 10:44:20 CEST 2006
Hi Andy,
> > I'm using R (windows) version 2.1.1, randomForest version 4.15.
> ^^^^^^^^^^^^^^^^^^^^^^^^^
> Never seen such a version...
Oops! I meant 4.5-15.
> > I then save each tree to a file so I can combine them all
> > afterwards. There are no memory issues when
> > keep.forest=FALSE. But I think that's the bit I need for
> > future predictions (right?).
>
> Yes, but what is your question? (Do you mean each *forest*,
> instead of each *tree*?)
I mean the component named "forest" in the object returned by randomForest()
(the one that takes up all the memory!).
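
For reference, the workflow described above might look roughly like the sketch
below. It is only an illustration under stated assumptions: the built-in iris
data stands in for the real data, the file names, the number of forests and the
ntree values are made up, and combine() is the randomForest function for merging
separately grown forests.

```r
library(randomForest)

set.seed(1)
out_dir <- tempdir()  # hypothetical location for the saved forests

# Grow three small forests and save each one to its own file.
# keep.forest = TRUE keeps the "forest" component needed by predict().
files <- character(3)
for (i in 1:3) {
  rf <- randomForest(Species ~ ., data = iris, ntree = 50, keep.forest = TRUE)
  files[i] <- file.path(out_dir, paste("rf_", i, ".RData", sep = ""))
  save(rf, file = files[i])
}

# Reload the saved forests one by one and merge them into a single object.
forests <- lapply(files, function(f) {
  env <- new.env()
  load(f, envir = env)
  get("rf", envir = env)
})
rf_all <- do.call(randomForest::combine, forests)

# The combined object holds all the trees and can be used for prediction.
print(rf_all$ntree)
pred <- predict(rf_all, iris[1:5, ])

# Compare the size of the "forest" component with the whole object.
print(object.size(rf_all$forest), units = "Kb")
print(object.size(rf_all), units = "Kb")

# With keep.forest = FALSE the "forest" component is dropped (NULL),
# so the object is smaller but cannot be used with predict() on new data.
rf_small <- randomForest(Species ~ ., data = iris, ntree = 50,
                         keep.forest = FALSE)
print(is.null(rf_small$forest))
```

The size comparison at the end is there to check how much of the object the
"forest" component accounts for before many such objects are combined.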
> > A bit off the subject, but should the order in which rows
> > (ie. sets of explanatory variables) are passed to the
> > randomForest function affect the result? I have noticed that
> > if I pick a random unordered sample from my control data for
> > training the error rate is much lower than if I take an
> > ordered sample. This remains true for all my cross-validation
> > results.
>
> I'm not sure I understand. In randomForest() (as in other
> functions) variables are in columns, rather than rows, so
> are you talking about variables (columns) in different order
> or data (rows) in different order?
Yes, sorry I confused you. I mean the order in which the data (rows) are
passed, not the columns.
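
To make the distinction between the two kinds of sample concrete, here is a
small sketch. It is not taken from the real data: the data frame, the variable
x and the sample size of 300 are all hypothetical, and it assumes the rows
happen to be sorted by some variable, which may or may not be the case for the
control data.

```r
set.seed(42)

# Hypothetical control data whose rows are sorted by one variable.
control <- data.frame(x = sort(runif(1000)))

# "Ordered" sample: simply the first 300 rows.
ordered_rows <- control[1:300, , drop = FALSE]

# "Random unordered" sample: 300 rows drawn at random.
random_rows <- control[sample(nrow(control), 300), , drop = FALSE]

# The ordered sample covers only the low end of x,
# while the random sample spans almost the whole range.
print(range(ordered_rows$x))
print(range(random_rows$x))
```

If the rows are sorted like this, the two samples describe different parts of
the data, which would be one possible reason for different error rates.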
Finally, I see from
http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#inter
that there is a component in Breiman's implementation of random forests that
computes interactions between variables. Has this been implemented in R yet?
Many thanks for your time and help.
Eleni Rapsomaniki