[R] Memory Management under Linux: Problems allocating large amounts of data

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Jun 29 15:18:05 CEST 2005


Let's assume this is a 32-bit Xeon and a 32-bit OS (there are 
64-bit-capable Xeons).  Then a user process like R gets a 4GB address 
space, 1GB of which is reserved for the kernel.  So R has a 3GB address 
space, and it is trying to allocate a 2GB contiguous chunk.  Because of 
memory fragmentation that is quite unlikely to succeed.
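
For scale, a quick back-of-the-envelope in R (nothing assumed here beyond 
the figures in the message quoted below):

    158902553 * 8 / 2^20    # ~159 million doubles at 8 bytes each: ~1212 MiB
    2048000 * 1024 / 2^20   # the failed request: 2000 MiB, 2/3 of the 3GB space

A single unbroken 2000 MiB range is a lot to ask of a 3GB space that 
already holds the R executable, libraries and heap.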

We run 64-bit OSes on all our machines with 2GB or more RAM, for this 
reason.
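
You can check which kind of R build you are running from within R itself 
(note this reports the R build, not the hardware or the OS):

    .Machine$sizeof.pointer   # 4 on a 32-bit build, 8 on a 64-bit build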

On Wed, 29 Jun 2005, Dubravko Dolic wrote:

> Dear Group
>
> I'm still trying to bring a lot of data into R (see older postings). 
> After solving some troubles with the database I now do most of the work 
> in MySQL. But it would still be nice to work on some of the data in R. 
> For this I can use a dedicated server with Gentoo Linux as the OS, 
> hosting only R. This server is a nice machine with two CPUs and 4GB of 
> RAM, which should do the job:
>
> Dual Intel XEON 3.06 GHz
> 4 x 1 GB RAM PC2100 CL2
> HP Proliant DL380-G3
>
> I read the R online help on memory issues and the article on garbage 
> collection in R News 1/2001 (Luke Tierney). The FAQ and some newsgroup 
> postings were also very helpful for understanding memory issues in R.
>
> Now I am trying to read data from a database. The data I want to read 
> consists of 158902553 rows and one field (column), of type bigint(20) in 
> the database. I got the message that R could not allocate the 2048000 Kb 
> (almost 2GB) vector. As I have 4GB of RAM I could not see why this 
> happened. In my understanding R under Linux (32-bit) should be able to 
> use the full RAM. As not much space is used by the OS and R itself 
> ("free" shows approx. 670 MB in use after dbSendQuery and fetch), there 
> should be 3GB left for R to occupy. Is that correct?

Not really.  The R executable code and the Ncells are already in the 
address space, and this is a virtual memory OS, so the amount of RAM is 
not relevant (it would still be a 3GB limit with 12GB of RAM).
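
To illustrate the fragmentation point (a sketch only; whether these 
allocations succeed depends on how your 3GB space is already carved up):

    # Four separate ~512 MB vectors only need four moderate-sized gaps,
    # but one ~2 GB vector must find a single unbroken range.
    x <- lapply(1:4, function(i) numeric(64e6))   # 4 x ~512 MB: may succeed
    y <- numeric(256e6)                           # one ~2 GB block: likely fails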

> After that I started R with the nsize/vsize limits set explicitly:
>
> R --min-vsize=10M --max-vsize=3G --min-nsize=500k --max-nsize=100M
>
>> mem.limits()
>    nsize     vsize
> 104857600        NA
>
> and received the same message.
>
>
> A garbage collection delivered the following information:
>
>> gc()
>         used (Mb) gc trigger   (Mb) limit (Mb)  max used   (Mb)
> Ncells 217234  5.9     500000   13.4       2800    500000   13.4
> Vcells  87472  0.7  157650064 1202.8       3072 196695437 1500.7
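
(The limits did take effect, as the 'limit (Mb)' column shows.  On a 
32-bit build each cons cell takes 28 bytes, so

    104857600 * 28 / 2^20    # --max-nsize=100M cells: the 2800 Mb Ncells limit
    3 * 1024                 # --max-vsize=3G: the 3072 Mb Vcells limit

The problem is therefore not the limits but the address space.)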
>
>
> Now I'm at a loss. Maybe someone could give me a hint on where I should 
> read further, or some information that would take me further.
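
If you must stay on 32-bit, one way around the single large allocation is 
not to make it: fetch the column in chunks and reduce each chunk as you 
go.  A minimal sketch, assuming the RMySQL/DBI setup from your earlier 
postings (the connection details and table/column names here are made up):

    library(RMySQL)
    con <- dbConnect(MySQL(), dbname = "mydb")        # hypothetical connection
    res <- dbSendQuery(con, "SELECT big_col FROM big_table")
    s <- 0; n <- 0
    repeat {
        chunk <- fetch(res, n = 1000000)   # ~8 MB of doubles per chunk
        if (nrow(chunk) == 0) break        # no rows left
        s <- s + sum(chunk[[1]])
        n <- n + nrow(chunk)
    }
    dbClearResult(res)
    dbDisconnect(con)
    s / n    # e.g. the mean, computed without ever holding the whole vector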

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
