[R] Reading huge chunks of data from MySQL into Windows R
Duncan Murdoch
murdoch at stats.uwo.ca
Mon Jun 6 15:49:22 CEST 2005
On 6/6/2005 9:30 AM, Dubravko Dolic wrote:
> Dear List,
>
> I'm trying to use R under Windows on a huge database in MySQL via ODBC
> (technical reasons for this...). Now I want to read tables with some
> 160,000,000 entries into R. I would be grateful if anyone out there has
> some good hints on what to consider concerning memory management. I'm
> not sure about the best method for reading such huge tables into R. For
> the moment I split the whole table into readable parts and stick them
> back together in R again.
>
> Any hints welcome.
Most values in R are stored as 8-byte doubles, so 160,000,000 entries
will take roughly 1.2 GB of storage. (Half that if they are integers or
factors.) You are likely to run into problems manipulating something
that big in Windows, because a 32-bit process is normally allowed only
2 GB of address space, and that space can become fragmented.
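A quick sanity check on that arithmetic in R (just a sketch; the
160,000,000 figure is taken from your message):

  n <- 160e6                 # number of entries
  n * 8 / 2^30               # doubles: about 1.19 GiB
  n * 4 / 2^30               # integers/factors: about 0.6 GiB
  object.size(numeric(1e6))  # ~8 MB per million doubles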
I'd suggest developing algorithms that can work on the data one block at
a time, so that you never need to stick the whole thing together in R at
once. Alternatively, switch to a 64-bit platform and install lots of
memory -- but there are still various 4 GB limits in R, so you may still
run into trouble.
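For illustration, a block-at-a-time loop with RODBC might look like the
sketch below. The DSN name "mysql_dsn", the table "bigtable" and the
column "x" are placeholders; sqlGetResults() with its max argument
retrieves a pending result set in pieces:

  library(RODBC)
  ch <- odbcConnect("mysql_dsn")               # placeholder DSN
  odbcQuery(ch, "SELECT x FROM bigtable")      # submit the query once
  total <- 0
  count <- 0
  repeat {
      chunk <- sqlGetResults(ch, max = 100000) # fetch up to 100,000 rows
      if (!is.data.frame(chunk) || nrow(chunk) == 0)
          break                                # result set exhausted
      total <- total + sum(chunk$x)            # update running statistics
      count <- count + nrow(chunk)
  }
  odbcClose(ch)
  total / count                                # overall mean, computed blockwise

That way each block can be garbage-collected before the next one is
fetched, so the whole table never has to sit in memory at once.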
Duncan Murdoch