[R-sig-hpc] Decision trees in R with big data

Simon Yansen Zhao simonyansenzhao at gmail.com
Wed Apr 16 05:05:55 CEST 2014


The R package [wsrf](http://cran.r-project.org/package=wsrf) may be 
worth trying.  We are working to make it capable of handling big data, 
although there is still room for improvement.

Hope this helps!


-- 
Regards,
Simon Yansen Zhao


On 2014-04-15 18:00, r-sig-hpc-request at r-project.org wrote:
 > Message: 1
 > Date: Mon, 14 Apr 2014 11:53:26 -0500
 > From: Supriya Jain <sjsjsj2009 at gmail.com>
 > To: r-sig-hpc at r-project.org
 > Subject: [R-sig-hpc] Decision trees in R with big data
 > Message-ID:
 > 	<CAPc8pCJugsYbMuFEGHFPy9wjg-zn9dVQzx7rJSEckLVeiciGFg at mail.gmail.com>
 > Content-Type: text/plain
 >
 > Hi,
 >
 > I have successfully used rpart, but only with a few thousand rows and a few
 > hundred input attributes. When using data with ~2 million rows (instances)
 > and ~20,000 input attributes (typical data sizes in my application), I get
 > memory problems when using rpart.
 >
 > Does anyone know of a Decision tree algorithm that works in R with big
 > data?
 >
 > Thanks!
 >


