[R] large data set, error: cannot allocate vector

Robert Citek rwcitek at alum.calberkeley.org
Fri May 5 17:24:51 CEST 2006


Why am I getting the error "Error: cannot allocate vector of size  
512000 Kb" on a machine with 6 GB of RAM?

I'm playing with some large data sets within R and doing some simple
statistics.  The data sets have 10^7 and 10^8 rows of numbers.  R
reads in and performs summary() on the 10^7 set just fine.  However,
on the 10^8 set, R halts with the error.  My hunch is that somewhere
there's a setting that limits some memory size to 500 MB.  What setting
is that, can it be increased, and if so, how?  Googling for the error
has produced lots of hits but none with answers yet.  Still browsing.

Below is a transcript of the session.

Thanks in advance for any pointers in the right direction.

Regards,
- Robert
http://www.cwelug.org/downloads
Help others get OpenSource software.  Distribute FLOSS
for Windows, Linux, *BSD, and MacOS X with BitTorrent

-------

$ uname -sorv ; rpm -q R ; R --version
Linux 2.6.11-1.1369_FC4smp #1 SMP Thu Jun 2 23:08:39 EDT 2005 GNU/Linux
R-2.3.0-2.fc4
R version 2.3.0 (2006-04-24)
Copyright (C) 2006 R Development Core Team

$ wc -l dataset.010MM.txt
10000000 dataset.010MM.txt

$ head -3 dataset.010MM.txt
15623
3845
22309

$ wc -l dataset.100MM.txt
100000000 dataset.100MM.txt

$ head -3 dataset.100MM.txt
15623
3845
22309

$ cat ex3.r
options(width=1000)
foo <- read.delim("dataset.010MM.txt")
summary(foo)
foo <- read.delim("dataset.100MM.txt")
summary(foo)

$ R < ex3.r

R > foo <- read.delim("dataset.010MM.txt")

R > summary(foo)
      X15623
Min.   :    1
1st Qu.: 8152
Median :16459
Mean   :16408
3rd Qu.:24618
Max.   :32766

R > foo <- read.delim("dataset.100MM.txt")
Error: cannot allocate vector of size 512000 Kb
Execution halted

$ free -m
             total       used       free     shared    buffers     cached
Mem:          6084       3233       2850          0         20         20
-/+ buffers/cache:       3193       2891
Swap:         2000       2000          0
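
One way to test the 500 MB hunch is to compare the failed request with the size of the data and then look at the per-process limits the shell imposes. The following is only a rough sketch, not part of the session above. The helper names kb_to_mb and vector_mb are made up for illustration. The only R facts assumed are that an integer takes 4 bytes and a double takes 8. Whether this machine runs a 32-bit or 64-bit userland is not shown in the transcript, so the last check is a guess at a possible cause, not a diagnosis.

```shell
#!/bin/sh
# Hypothetical helper: convert a size in Kb (as printed in R's error
# message) to whole MB.
kb_to_mb() {
    echo $(( $1 / 1024 ))
}

# Hypothetical helper: approximate MB needed for one vector of
# $1 values at $2 bytes per value (4 = integer, 8 = double).
vector_mb() {
    echo $(( $1 * $2 / 1024 / 1024 ))
}

echo "failed request:      $(kb_to_mb 512000) MB"
echo "10^8 integers:       $(vector_mb 100000000 4) MB"
echo "10^8 doubles:        $(vector_mb 100000000 8) MB"

# Per-process limits that could cap an allocation
# ("unlimited" means the shell imposes no cap).
echo "virtual memory (Kb): $(ulimit -v)"
echo "data segment (Kb):   $(ulimit -d)"

# Word size of the userland. A 32-bit process has only a few GB of
# address space no matter how much RAM is installed.
echo "word size (bits):    $(getconf LONG_BIT)"
```

The 512000 Kb in the message is the size of the single request that failed (500 MB), which is not necessarily a configured ceiling. If both ulimit values print "unlimited", a shell-level cap is unlikely, and the address space or the exhausted swap shown by free -m would be the next things to look at.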



