[R] R on Large Data Sets (again)

Jason Morgan jwm-r-help at skepsi.net
Sun Nov 29 03:18:26 CET 2009


Hello Lars,

On 2009.11.28 18:53:09, Lars Bishop wrote:
> Dear R users,
> 
> I've searched the R site for help on this topic but it is hard to find a
> precise answer for my questions.
> 
> Which are the best options to overcome the RAM memory limitation problems
> when using R on "large" data sets (such as 2 or 3 million records)?

I think you'll have to provide a more precise definition of
"large"---are we talking 1 GB of records or 100 GB? Also, it would help
to know what you are trying to do with the data. The documentation for
the biglm and bigmemory packages may provide some help.
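
To make "large" concrete, a rough estimate of the in-memory size is a
useful starting point. The snippet below is only a back-of-the-envelope
sketch: the row count comes from your message, but the column count
(n_cols) is an assumption of mine, and it treats every column as
numeric (8 bytes per value). Factors and character columns will behave
differently.

```r
# Rough in-memory size of a purely numeric data set.
n_rows <- 3e6           # "2 or 3 million records" from the question
n_cols <- 20            # ASSUMPTION: hypothetical number of numeric columns
bytes_per_double <- 8   # R stores numeric values as 8-byte doubles

est_gb <- n_rows * n_cols * bytes_per_double / 1024^3
cat(sprintf("Estimated size: %.2f GB\n", est_gb))

# Cross-check: measure a small simulated sample and scale it up.
sample_rows <- 1e4
sample_df <- as.data.frame(matrix(rnorm(sample_rows * n_cols), ncol = n_cols))
sample_bytes <- as.numeric(object.size(sample_df))
scaled_gb <- sample_bytes * (n_rows / sample_rows) / 1024^3
cat(sprintf("Scaled from sample: %.2f GB\n", scaled_gb))
```

With these assumed numbers the data alone come to a bit under half a
gigabyte. Keep in mind that model fitting usually makes working copies,
so peak usage can be several times the size of the raw object.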

> - Is the freely available version of R (as opposed to the one provided
> by REvolution Computing) compatible with a Windows 64-bit machine?
> And if I increase the RAM enough on Win64, would this
> virtually solve my memory limitation problems?

I'm not familiar enough with the commercial version of R to say for
sure, but I do believe it provides better support for parallelization,
which may be of some help. I don't think, however, that this version
will "solve" your problem on its own.

> - Is a Unix-like platform a better option than win-64? Again, would
> this solve my memory limitation problems?

Possibly, but Win64 should provide plenty of memory (I believe Windows 7
Ultimate can use up to 192 GB of memory). You just have to find a
machine that can take that much... With Unix/Linux you can probably cut
back some overhead, and the memory management is most likely better, but
unless you need to go over 192 GB of memory, you don't necessarily have
to move to a different platform.
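
Whichever platform you end up on, the extra RAM only helps if the R
build you are running is itself 64-bit. A quick way to check from
within a session, using only base R (the variable names are just for
illustration):

```r
# Check whether the running R session is a 64-bit build.
pointer_bytes <- .Machine$sizeof.pointer   # 8 on 64-bit builds, 4 on 32-bit
is_64bit <- pointer_bytes == 8

cat("Architecture:", R.version$arch, "\n")
cat("64-bit build:", is_64bit, "\n")
```

If this reports a 32-bit build, the process is limited to a few
gigabytes of address space no matter how much memory the machine has.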

~Jason

-- 
Jason W. Morgan
Graduate Student
Department of Political Science
*The Ohio State University*
154 North Oval Mall
Columbus, Ohio 43210



