[R] memory limit

Henrik Bengtsson hb at stat.berkeley.edu
Thu Nov 27 01:07:33 CET 2008


On Wed, Nov 26, 2008 at 1:16 PM, Stavros Macrakis <macrakis at alum.mit.edu> wrote:
> I routinely compute with a 2,500,000-row dataset with 16 columns,
> which takes 410MB of storage; my Windows box has 4GB, which avoids
> thrashing.  As long as I'm careful not to compute and save multiple
> copies of the entire data frame (because 32-bit Windows R is limited
> to about 1.5GB address space total, including any intermediate
> results), R works impressively well and fast with this dataset for
> selections, calculations, cross-tabs, plotting, etc.  For example,
> simple single-column statistics and cross-tabs take << 1 sec., summary
> of the whole thing takes 16 sec. A linear regression between two
> numeric columns takes < 20 sec. Plotting of all 2.5M points takes a
> while, but that is no surprise (and is usually pointless [sic]
> anyway). I have not tried to do any compute-intensive statistical
> calculations on the whole data set.
>
> The main (but minor) annoyance with it is that it takes about 90 secs
> to load into memory using R's native binary "save" format, so I tend
> to keep the process lying around rather than re-starting and
> re-loading for each analysis. Fortunately, garbage collection is very
> effective in reclaiming unused storage as long as I'm careful to
> remove unnecessary objects.

FYI, objects saved with save(..., compress=FALSE) are notably faster
to read back with load(), at the cost of a larger file on disk.
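
A minimal sketch of that comparison, assuming a small synthetic data
frame as a stand-in for the real dataset (the object name `df`, its
columns, and the temporary file names are made up for illustration).
In recent R versions save() compresses by default, so the first call
below is the compressed case. Actual timings depend on disk speed and
on how compressible the data are, so measure on your own data:

```r
# Hypothetical stand-in for a large data frame; size and columns are assumptions.
set.seed(1)
n <- 100000
df <- data.frame(id = seq_len(n),
                 x  = rnorm(n),
                 g  = sample(letters, n, replace = TRUE))

f_compressed   <- tempfile(fileext = ".RData")
f_uncompressed <- tempfile(fileext = ".RData")

# Default save() writes a compressed file.
save(df, file = f_compressed)
# compress = FALSE writes a larger file that needs no decompression on load.
save(df, file = f_uncompressed, compress = FALSE)

# Compare file sizes in bytes.
size_compressed   <- file.info(f_compressed)$size
size_uncompressed <- file.info(f_uncompressed)$size
print(c(compressed = size_compressed, uncompressed = size_uncompressed))

# Load each file into its own environment so the original df is untouched,
# and time the two loads.
e_compressed   <- new.env()
e_uncompressed <- new.env()
t_compressed   <- system.time(load(f_compressed,   envir = e_compressed))
t_uncompressed <- system.time(load(f_uncompressed, envir = e_uncompressed))
print(rbind(compressed = t_compressed, uncompressed = t_uncompressed))
```

Both files restore the same object; only the size on disk and the
read time differ.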

/Henrik

>
>            -s
>
>
> On Wed, Nov 26, 2008 at 7:42 AM, iwalters <iwalters at cellc.co.za> wrote:
>>
>> I'm currently working with very large datasets that consist of 1,000,000+
>> rows.  Is it at all possible to use R for datasets this size, or should I
>> rather consider C++/Java?
>>
>>
>> --
>> View this message in context: http://www.nabble.com/increasing-memory-limit-in-Windows-Server-2008-64-bit-tp20675880p20699700.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>


