[R] big data file versus ram memory

Stephan Kolassa Stephan.Kolassa at gmx.de
Thu Dec 18 21:07:46 CET 2008


Hi Mauricio,

Mauricio Calvao wrote:
> 1) I would very much like to use R for processing some big data files
> (around 1.7 GB or more) for spatial analysis, wavelets, and power
> spectrum estimation; is this possible with R? Within IDL, such a big
> data set seems to be tractable...

There are some packages for handling large datasets, e.g., bigmemory.
There were a couple of presentations on various ways to work with large
datasets at the last useR! conference; take a look at the presentations at
http://www.statistik.uni-dortmund.de/useR-2008/
You'll probably be most interested in the "High Performance" streams.
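
As a minimal sketch of how that might look (the file names and options
here are made up for illustration), bigmemory can keep the data in a
file-backed matrix, so it does not have to fit into RAM in one piece:

library(bigmemory)
## read a large CSV into a file-backed big.matrix
x <- read.big.matrix("bigdata.csv", header = TRUE, type = "double",
                     backingfile = "bigdata.bin",
                     descriptorfile = "bigdata.desc")
dim(x)        # dimensions, just as for an ordinary matrix
mean(x[, 1])  # pull one column into RAM at a time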

> 2) I have heard/read that R "puts all its data in RAM". Does this
> really mean my data file cannot be bigger than my RAM?

R's basic philosophy is to keep all data in RAM. Working on data outside
RAM is not exactly heretical to R, but it does require some additional
effort.
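
One common form that effort takes is processing the file in chunks
through a connection rather than reading it whole. A minimal sketch
(file name and chunk size made up):

con <- file("bigdata.csv", open = "r")
header <- readLines(con, n = 1)  # read past the header line
repeat {
  ## read the next 100,000 rows; at end-of-file read.csv raises an
  ## error, which we catch to end the loop
  chunk <- tryCatch(read.csv(con, header = FALSE, nrows = 100000),
                    error = function(e) NULL)
  if (is.null(chunk)) break
  ## ... update running sums, partial spectra, etc., here ...
}
close(con)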

> 3) If I have enough RAM, would I be able to process whatever data set
> I like? What determines the practical limits of my data sets?

From what I understand: little to nothing, beyond the time needed for
the computations (and, if I remember correctly, R's limit of 2^31 - 1
elements in a single vector).
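
For a rough sense of scale (the vector length here is arbitrary): a
double takes 8 bytes, so 10^7 doubles occupy about 76 Mb, and
intermediate copies made during a computation can multiply that
footprint:

x <- numeric(1e7)                    # 1e7 doubles at 8 bytes each
print(object.size(x), units = "Mb")  # about 76.3 Mb; scales linearly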

HTH,
Stephan


