[R] Memory management

Takatsugu Kobayashi tkobayas at indiana.edu
Sun Sep 16 06:19:40 CEST 2007


Hi,

I apologize again for posting something not suitable for this list.

Basically, it sounds like I should put this large dataset into a 
database. The dataset I have had trouble with is the transportation 
network of the Chicago Consolidated Metropolitan Statistical Area. It 
has about 7,200 points, and every point has outbound and inbound 
traffic flows: volumes, times, distances, etc. So a quick approximation 
of the size would be 49,000,000 rows (and 249 columns).
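
As a rough check on those dimensions, the sketch below estimates how much memory such a table would need once loaded. It assumes every cell is stored as an 8-byte double, which may not match the real column types, and `estimate_gib` is a hypothetical helper name, not an existing R function.

```r
# Rough in-memory size of a rectangular numeric table.
# Assumption: every cell is a double (8 bytes); integer or character
# columns would change the figure.
estimate_gib <- function(n_rows, n_cols, bytes_per_cell = 8) {
  n_rows * n_cols * bytes_per_cell / 2^30
}

# Dimensions quoted in the message above.
size_gib <- estimate_gib(49e6, 249)
cat(sprintf("Estimated size: %.1f GiB\n", size_gib))
```

Under that assumption the estimate is far above 8GB of RAM, so the table would not fit in memory in one piece even before R makes any working copies.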

This is a text file. I could work with a portion of the data at a time, 
such as nearest neighbors or pairs of points.

I used read.table('filename', header=F). I should probably read the data 
a piece at a time instead of loading all of it at once.
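
One possible way to read a piece at a time is to keep a connection open and call read.table repeatedly with `nrows`, so each call continues where the previous one stopped. This is a minimal sketch, not a tuned solution: `read_in_chunks` and `process_chunk` are hypothetical names, the file is assumed to be whitespace-delimited with no header, and the demo uses a small temporary file in place of the real data.

```r
# Read a headerless text file a fixed number of rows at a time.
# 'process_chunk' is a hypothetical callback applied to each data frame.
read_in_chunks <- function(path, chunk_rows, process_chunk) {
  con <- file(path, open = "r")   # opened here so the read position persists
  on.exit(close(con))
  repeat {
    chunk <- tryCatch(
      read.table(con, header = FALSE, nrows = chunk_rows),
      # read.table signals an error when no lines remain; this simple
      # sketch treats any error as end of input.
      error = function(e) NULL
    )
    if (is.null(chunk) || nrow(chunk) == 0) break
    process_chunk(chunk)
    if (nrow(chunk) < chunk_rows) break   # a short chunk means the file is done
  }
  invisible(NULL)
}

# Demo on a toy file: 10 rows, 3 columns, read 4 rows per chunk.
tmp <- tempfile(fileext = ".txt")
write.table(matrix(1:30, nrow = 10), tmp, row.names = FALSE, col.names = FALSE)

total_rows <- 0
read_in_chunks(tmp, chunk_rows = 4, function(chunk) {
  total_rows <<- total_rows + nrow(chunk)
})
unlink(tmp)
```

Only one chunk is held in memory at a time, so the callback should reduce each chunk to a summary (counts, sums, selected rows) rather than accumulate the raw data.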

I am learning RSQLite and RMySQL. As Mr. Wan suggests, I will also learn 
a bit more C.
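
The database approach could look roughly like the following with RSQLite. It is a sketch under several assumptions: the DBI and RSQLite packages are installed, an in-memory database stands in for a file on disk, and the table name `flows` and its columns are made up for illustration rather than taken from the real dataset.

```r
library(DBI)

# Assumption: RSQLite is installed. ":memory:" is used only for the demo;
# a real dataset would go into a database file on disk.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Toy origin-destination table; column names are illustrative only.
flows <- data.frame(
  origin = c(1, 1, 2, 2),
  dest   = c(2, 3, 1, 3),
  volume = c(120, 45, 80, 60)
)
dbWriteTable(con, "flows", flows)

# Retrieve only the rows for one origin instead of loading everything.
from_1 <- dbGetQuery(con, "SELECT dest, volume FROM flows WHERE origin = 1")

dbDisconnect(con)
```

Each query returns an ordinary data frame, so the rest of an analysis can stay in R while the full table remains on disk.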

Thank you very much.

TK

jim holtman wrote:
> When you say you cannot import 4.8GB, is this the size of the text
> file that you are reading in?  If so, what is the structure of the
> file?  How are you reading in the file ('read.table', 'scan', etc.)?
>
> Do you really need all the data or can you work with a portion at a
> time?  If so, then consider putting the data in a database and
> retrieving the data as needed.  If all the data is in an object, how
> big do you think this object will be? (# rows, # columns, mode of the
> data).
>
> So you need to provide some more information as to the problem that
> you are trying to solve.
>
> On 9/15/07, tkobayas at indiana.edu <tkobayas at indiana.edu> wrote:
>   
>> Hi,
>>
>> Let me apologize for this simple question.
>>
>> I use 64 bit R on my Fedora Core 6 Linux workstation. A 64 bit R has
>> saved a lot of time. I am sure this has a lot to do with my memory
>> limit, but I cannot import 4.8GB. My workstation has 8GB of RAM, an
>> Athlon X2 5600, and a 1200W PSU. This PC configuration is the best I
>> could get.
>>
>> I know a bit of C and Perl. Should I use C or Perl to manage this large
>> dataset? Or should I even go to 16GB of RAM?
>>
>> Sorry for this silly question, but I would appreciate it if anyone
>> could give me advice.
>>
>> Thank you very much.
>>
>> TK
>>