[R] big data file versus ram memory

David Winsemius dwinsemius at comcast.net
Thu Dec 18 21:17:19 CET 2008


On Dec 18, 2008, at 3:07 PM, Stephan Kolassa wrote:

> Hi Mauricio,
>
> Mauricio Calvao wrote:
>> 1) I would like very much to use R for processing some big data  
>> files (around 1.7 or more GB) for spatial analysis, wavelets, and  
>> power spectra estimation; is this possible with R? Within IDL, such  
>> a big data set seems to be tractable...
>
> There are some packages to handle large datasets, e.g., bigmemoRy.  
> There were a couple of presentations on various ways to work with  
> large datasets at the last useR conference - take a look at the  
> presentations at
> http://www.statistik.uni-dortmund.de/useR-2008/
> You'll probably be most interested in the "High Performance" streams.
>
>> 2) I have heard/read that R "puts all its data in RAM". Does this
>> really mean my data file cannot be bigger than my RAM?
>
> The philosophy is basically to use RAM. Anything working outside RAM  
> is not exactly heretical to R, but it does require some additional  
> effort.
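
As an illustration of that "additional effort", here is a minimal sketch of one common approach: reading a text file in chunks through a connection and keeping only a running summary in memory. It is not what bigmemoRy does internally. The function name chunk_mean, the column names, and the chunk size are all hypothetical, and the small file is generated only so that the example is self-contained.

```r
# Create a small CSV so the sketch runs on its own (stand-in for a big file).
path <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:1000, y = (1:1000) / 2), path, row.names = FALSE)

# Hypothetical helper: mean of one column, holding only `chunk_rows`
# rows in memory at any time.
chunk_mean <- function(path, column, chunk_rows = 100) {
  con <- file(path, open = "r")
  on.exit(close(con))
  # First line is the header; write.csv quotes names, so strip the quotes.
  header <- gsub('"', "", strsplit(readLines(con, n = 1), ",")[[1]])
  total <- 0
  n <- 0
  repeat {
    lines <- readLines(con, n = chunk_rows)
    if (length(lines) == 0) break          # end of file
    chunk <- read.csv(text = lines, header = FALSE, col.names = header)
    total <- total + sum(chunk[[column]])  # update the running summary
    n <- n + nrow(chunk)
  }
  total / n
}

result <- chunk_mean(path, "y")
```

This only works for analyses that can be expressed as a pass over the rows with a small running state; methods that need the whole data set at once (many spatial or spectral methods do) need a different strategy.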
>
>> 3) If I have enough RAM, would I be able to process any
>> data set? What constrains the practical size of my data sets?
>
> From what I understand - little to nothing, beyond the time needed  
> for computations.

Er, ... it depends. At a minimum, anyone considering this should have
read the FAQs. If this is a question about Windows, see R for Windows
FAQ 2.9:

http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021

There has been quite a bit of discussion of this on the list over the
last couple of years. Search the archives:
http://search.r-project.org/
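
A rough back-of-envelope check can show whether a data set is even a candidate for in-memory work. The sketch below assumes the data would be held as a dense numeric (double) matrix at 8 bytes per value. The helper estimate_gb and the dimensions are hypothetical, not taken from the original question. Many operations make temporary copies, so the working requirement is usually some multiple of this figure.

```r
# Hypothetical helper: approximate size in GiB of a dense numeric matrix,
# assuming 8 bytes per double and ignoring the small object header.
estimate_gb <- function(n_rows, n_cols, bytes_per_value = 8) {
  n_rows * n_cols * bytes_per_value / 1024^3
}

# Assumed dimensions, for illustration only.
est <- estimate_gb(15000, 15000)

# Sanity-check the 8-bytes-per-double rule on a small real object.
x <- numeric(1e6)
measured <- as.numeric(object.size(x))  # bytes, including a small header
```

Comparing est with the memory R can actually address on your platform (the Windows FAQ entry above covers that case) gives a first answer before any data are loaded.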

-- 
David Winsemius

>
>
> HTH,
> Stephan
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
