[R] R: machine for moderately large data

Ista Zahn istazahn at gmail.com
Fri Oct 5 19:41:58 CEST 2012


On Fri, Oct 5, 2012 at 12:09 PM, PIKAL Petr <petr.pikal at precheza.cz> wrote:
> Hi
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
>> project.org] On Behalf Of Skála, Zdeněk (INCOMA GfK)
>> Sent: Friday, October 05, 2012 3:38 PM
>> To: r-help at r-project.org
>> Subject: [R] R: machine for moderately large data
>>
>> Dear all,
>>
>> I would like to ask your advice about a suitable computer for the
>> following usage.
>> I (am starting to) work with moderately big data in R:
>> - approx. 2-20 million rows * 100-1000 columns (market basket data)
>> - mainly clustering, classification trees, association analysis (e.g.
>> packages rpart, cba, proxy, party)
>
> If I compute correctly, such a big matrix (20e6*1000) needs about 160 GB just to be in memory. Are you prepared for this?
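
A rough check of the 160 GB figure can be sketched in R. This assumes a
dense numeric matrix with every cell stored as an 8-byte double and
ignores object overhead; the helper name estimate_gb is only
illustrative and not from any package.

```r
# Rough memory estimate for a dense numeric matrix in R.
# Assumption: every cell is a double (8 bytes); object overhead is ignored.
bytes_per_double <- 8

estimate_gb <- function(n_rows, n_cols, bytes_per_cell = bytes_per_double) {
  # Decimal gigabytes (1 GB = 1e9 bytes), matching the figure quoted above
  n_rows * n_cols * bytes_per_cell / 1e9
}

upper <- estimate_gb(20e6, 1000)  # largest case: 20 million rows * 1000 columns
lower <- estimate_gb(2e6, 100)    # smallest case: 2 million rows * 100 columns

cat(sprintf("upper bound: %.1f GB\n", upper))  # upper bound: 160.0 GB
cat(sprintf("lower bound: %.1f GB\n", lower))  # lower bound: 1.6 GB
```

If the columns can be held as integer or logical values, R uses 4 bytes
per cell, so estimate_gb(20e6, 1000, 4) would halve the upper bound.
Many R operations also copy their input, so the working requirement is
likely to be higher than the size of the data alone.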

This is not as outrageous as one might think -- you can get a Mac Pro
with 32 GB of memory for around $3,500.

Best,
Ista

>
> Maybe a suitable database interface would be preferable.
>
> Regards
> Petr
>
>>
>> Can you recommend a computer adequate for this volume?
>> I routinely work in Windows but feel that a Mac or a Linux
>> machine might be needed.
>>
>> Please respond directly to my email.
>> Many thanks!
>>
>> Zdenek Skala
>> zdenek.skala at gfk.com
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.



