[R] R memory limits on table(x, y) (and bigtabulate)

Robert Zimbardo robertzimbardo at gmail.com
Mon Jul 3 10:57:38 CEST 2017


I have two character vectors x and y that have the following characteristics:

length(x)          # 872099
length(y)          # 872099 (same as x)

length(unique(x))  # 47740
length(unique(y))  # 52478

I need to cross-tabulate them, which would lead to a table with

47740*52478 # 2505299720

cells, which is more than

2^31 # 2147483648

cells, which appears to be R's limit, because I get the error message

Error in table(x, y) : attempt to make a table with >= 2^31 elements
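For scale, even if the 2^31 element limit did not exist, the dense table itself would be sizeable (the arithmetic below is just the cell count above, converted to bytes assuming integer counts at 4 bytes each):

    cells <- 47740 * 52478   # 2505299720 cells in the dense table
    cells > 2^31             # TRUE: exceeds the 2^31 element limit
    cells * 4 / 2^30         # ~9.3 GiB just to hold integer counts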

Two questions:

- is this really R's limit, even on a 64-bit machine? It seems like it
(given <https://stat.ethz.ch/R-manual/R-devel/library/base/html/Memory-limits.html>
and <http://www.win-vector.com/blog/2015/06/r-in-a-64-bit-world/>), but
I just want to make sure I understood that correctly;
- I thought I could handle this with the package bigtabulate, but whenever I run

xy.tab <- bigtable(data.frame(x, y), ccols=1:2)

R crashes as follows:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted
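For what it is worth, since only observed pairs can be non-zero (at most
length(x) = 872099 of the ~2.5e9 cells), I am also considering a sparse
workaround along these lines -- untested at this scale, and the first
variant assumes the Matrix package is installed:

    ## Sparse cross-tabulation: storage only for observed (x, y) pairs
    library(Matrix)                            # needed for the sparse result
    xy.sparse <- xtabs(~ x + y, sparse = TRUE) # sparseMatrix, not a dense array

    ## Or, in base R, count the observed pairs directly; at most
    ## length(x) distinct keys, so no 2.5e9-cell allocation
    xy.counts <- table(paste(x, y, sep = "\r"))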

Any idea what I am doing wrong with bigtabulate? Thanks for your
consideration.


