[R] Running randomForests on large datasets

Liaw, Andy andy_liaw at merck.com
Wed Feb 27 14:24:47 CET 2008

There are a couple of things you may want to try, if you can load the
data into R and still have enough memory to spare:

- Run randomForest() with fewer trees, say 10 to start with.

- Run randomForest() with nodesize set to something larger than the
default (1 for classification, 5 for regression).  This puts a limit
on the size of the trees being grown.  Try something like 21 and see
if that runs, then adjust accordingly.
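Putting the two suggestions together, a call might look like the sketch
below.  This assumes a data frame `big_df` with a factor response column
`y`; those names are made up for illustration, and the exact ntree and
nodesize values will need tuning for your data:

```r
library(randomForest)

## Fewer, shallower trees to keep memory use down:
##   ntree = 10    - start with a small forest and grow it if it fits
##   nodesize = 21 - larger terminal nodes, hence smaller trees
fit <- randomForest(y ~ ., data = big_df,
                    ntree = 10, nodesize = 21)
print(fit)
```

If that runs, you can increase ntree (or combine several small forests
with combine()) and lower nodesize until you hit the memory ceiling.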


From: Nagu

> Hi,
> I am trying to run randomForest on a dataset of size 500000 x 650 and
> R pops up a memory allocation error. Are there any better ways to deal
> with large datasets in R? For example, S-PLUS had something like the
> bigdata library.
> Thank you,
> Nagu

