[R] importing a VERY LARGE database from Microsoft SQL into R

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Jan 25 08:47:59 CET 2006


On Tue, 24 Jan 2006, roger bos wrote:

> This question comes up a number of times, and what most people will tell you
> is that even if you get all your data into R you won't be able to do anything
> with it.  By that I mean you need about 3 or 4 times as much memory as the
> size of your data object, because R will need to create copies of it.
>
> I can tell you what I do in case it helps.  I also have a SQL Server
> database, and the good thing about having the data in that format is that
> you probably don't need all of the data all of the time.  First of all, a
> Windows machine can handle up to 4GB of RAM, but most software cannot use all of

Or even up to 32Gb of RAM.  But a single process is limited to 3Gb of user 
address space, and to 2Gb unless you tell the OS to allow more.

> it by default.  I have 4GB and I also use the Windows binary, so whenever I
> download a new version of R, I have to modify the executable's header to
> make it LARGEADDRESSAWARE.  Using this trick, I can load my big matrix into
> R to the point where Task Manager shows that R is using about 1.7GB of
> memory.  Despite such large objects, I am able to do regressions and other
> things with the data, so I am quite happy.  If you need more details just
> let me know.

This is the default in the current R 2.2.1, but you also need to set the 
/3GB flag in Windows.  I found the limit was about 1.7Gb without doing 
that, more like 2.5Gb when I did.
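For what it's worth, here is a minimal sketch of how to check what R thinks
its limit is on a Windows build.  memory.size() and memory.limit() are
Windows-only, the values are in megabytes, and the ceiling you can actually
reach depends on the LARGEADDRESSAWARE and /3GB settings above:

    memory.size()              # Mb currently in use by this R session
    memory.size(max = TRUE)    # maximum Mb obtained from the OS so far
    memory.limit()             # current limit in Mb
    memory.limit(size = 2500)  # ask to raise the limit to about 2.5Gb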

The details are in the rw-FAQ Q2.9 (and vary by R version, so please 
consult that in the version of R you use).
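On the question itself: rather than pulling the whole database across in one
sqlQuery() call, it usually pays to do as much selection and aggregation as
possible in SQL and to fetch what remains in pieces.  A rough sketch using
RODBC (the DSN "mydsn" and table "bigtable" are placeholders for your own
names):

    library(RODBC)
    ch <- odbcConnect("mydsn")                      # placeholder DSN
    chunk <- sqlFetch(ch, "bigtable", max = 10000)  # first 10,000 rows
    while (is.data.frame(chunk) && nrow(chunk) > 0) {
        ## summarise or model each chunk here, keeping only the results
        chunk <- sqlFetchMore(ch, max = 10000)      # next 10,000 rows
    }
    odbcClose(ch)

That way only one chunk is held in R at a time, and R never needs the 3-4x
working copies of the full table.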

> On 1/24/06, r user <ruser2006 at yahoo.com> wrote:
>>
>> I am using R 2.1.1 in a Windows XP environment.
>>
>> I need to import a large database from Microsoft SQL
>> into R.
>>
>> I am currently using the "sqlQuery" function/command.
>>
>> This works, but I sometimes run out of memory if my
>> database is too big, or it takes quite a long time for
>> the data to import into R.
>>
>> Is there a better way to bring a large SQL database
>> into R?
>>
>> Is there an efficient way to convert the data into R
>> format prior to bringing it into R (e.g. directly
>> from Microsoft SQL)?

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



