[R] R: machine for moderately large data

PIKAL Petr petr.pikal at precheza.cz
Fri Oct 5 18:09:35 CEST 2012


Hi

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Skála, Zdeněk (INCOMA GfK)
> Sent: Friday, October 05, 2012 3:38 PM
> To: r-help at r-project.org
> Subject: [R] R: machine for moderately large data
> 
> Dear all,
> 
> I would like to ask your advice about a suitable computer for the
> following usage.
> I (am starting to) work with moderately big data in R:
> - approx. 2 - 20 million rows * 100 - 1000 columns (market basket data)
> - mainly clustering, classification trees, association analysis (e.g.
> libraries rpart, cba, proxy, party)

If I calculate correctly, a matrix that large (20e6 * 1000 cells, each stored as an 8-byte double) needs about 160 GB just to sit in memory. Are you prepared for this?
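
The arithmetic behind that figure can be sketched in a few lines of R. This is only a rough estimate: it assumes a dense matrix with every cell stored as a double (8 bytes), and the helper name matrix_bytes is made up for illustration. It ignores any working copies R makes during an analysis, so the real requirement is likely higher.

```r
# Rough RAM estimate for a dense matrix.
# Assumption: each cell is a double (8 bytes); matrix_bytes is a hypothetical helper.
matrix_bytes <- function(n_rows, n_cols, bytes_per_cell = 8) {
  n_rows * n_cols * bytes_per_cell
}

upper <- matrix_bytes(20e6, 1000)  # largest case mentioned in the question
lower <- matrix_bytes(2e6, 100)    # smallest case mentioned in the question

upper_gb <- upper / 1e9            # decimal gigabytes
lower_gb <- lower / 1e9

# The same upper case stored as integers (4 bytes per cell in R)
upper_int_gb <- matrix_bytes(20e6, 1000, bytes_per_cell = 4) / 1e9

print(c(upper_gb = upper_gb, lower_gb = lower_gb, upper_int_gb = upper_int_gb))
```

So the small end of the range (about 1.6 GB) fits on an ordinary machine, while the large end does not, even with integer storage (about 80 GB).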

A suitable database interface may be preferable.
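
As an illustration of what I mean, here is a minimal sketch using the DBI package with RSQLite. Both packages, the in-memory database, the table name and the toy columns are my assumptions for the example, not something from your setup. The idea is that the database does the heavy aggregation and only a small result comes back into R.

```r
library(DBI)

# Assumption: an in-memory SQLite database stands in for a real on-disk store.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Toy stand-in for market basket data (hypothetical columns).
baskets <- data.frame(
  basket_id = c(1L, 1L, 2L, 3L, 3L, 3L),
  item      = c("bread", "milk", "milk", "bread", "beer", "milk")
)
dbWriteTable(con, "baskets", baskets)

# Aggregate inside the database; only the summary is pulled into R.
item_counts <- dbGetQuery(
  con,
  "SELECT item, COUNT(*) AS n FROM baskets GROUP BY item ORDER BY n DESC, item"
)

dbDisconnect(con)
print(item_counts)
```

With real data the table would live on disk (or on a database server) and you would pull only the subset or summary needed for a given clustering or tree model.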

Regards
Petr

> 
> Can you recommend a sufficient computer for this volume?
> I am routinely working in Windows but feel that Mac or some linux
> machine might be needed.
> 
> Please, respond directly to my email.
> Many thanks!
> 
> Zdenek Skala
> zdenek.skala at gfk.com



