[R] memory problem for R

Liaw, Andy andy_liaw at merck.com
Fri Jan 30 04:22:23 CET 2004


Have you read the posting guide for R-help?  

You need to tell us more: What hardware/OS/version of R are you using?

A rough calculation of the storage needed:
> 6e5 * 70 * 8 / 1024^2   # rows * columns * 8 bytes per double, in MB
[1] 320.4346

So you need 320+ MB of RAM just to store the data as a matrix of doubles in
R.  You need enough RAM to make a couple of copies of this.  If any of the
variables are factors, the requirement goes up even more, as the design
matrix used to fit the model will expand the factors into columns of
contrasts.  How much physical RAM do you have on the computer?
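
For concreteness, here is a small sketch of that expansion (the toy data
below are made up, not from the original post):

## A 4-level factor becomes 3 contrast columns in the design
## matrix that lm() builds internally.
d <- data.frame(y = rnorm(8),
                x = rnorm(8),
                g = factor(rep(c("a", "b", "c", "d"), 2)))
X <- model.matrix(y ~ x + g, data = d)
dim(X)        # 8 rows, 5 columns: (Intercept), x, gb, gc, gd
colnames(X)

With the default treatment contrasts, a k-level factor contributes k - 1
columns of doubles, so the design matrix can be considerably wider than
the raw data.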

There are more efficient ways to fit a model to data of this size, but you
need to be able to at least fit the data into memory.  There have been a few
suggestions on R-help before on how to do this, so do search the archives.
(I believe Prof. Koenker had a web page describing how to do this with MySQL,
updating the X'X matrix by reading the data in chunks.)
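
To give a flavor of that chunked approach, here is a minimal sketch (this
is not Prof. Koenker's actual code; the file name, chunk size, and the
assumption that all columns are numeric with the response first are mine):

p <- 70                            # number of predictors
xtx <- matrix(0, p + 1, p + 1)     # X'X, +1 for the intercept
xty <- numeric(p + 1)              # X'y
con <- file("bigdata.txt", open = "r")
repeat {
    v <- scan(con, nlines = 10000, quiet = TRUE)
    if (length(v) == 0) break      # end of file
    chunk <- matrix(v, ncol = p + 1, byrow = TRUE)
    y <- chunk[, 1]
    X <- cbind(1, chunk[, -1])     # prepend the intercept column
    xtx <- xtx + crossprod(X)      # accumulate X'X chunk by chunk
    xty <- xty + crossprod(X, y)   # accumulate X'y likewise
}
close(con)
beta <- solve(xtx, xty)            # coefficients via the normal equations

Solving the normal equations this way is less numerically stable than the
QR decomposition lm() uses, but only one chunk is ever held in memory.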

Andy

> From: Yun-Fang Juan
> 
> Hi, 
> I tried to use lm to fit a linear model with 600k rows and 70
> attributes, but I can't even load the data into the R environment.
> The error message says the vector memory is used up.
> 
> Is there anyone with experience handling large datasets in R? (I bet
> there is.)
> 
> Please advise. 
> 
> thanks,
> 
> Yun-Fang 

