[Rd] 4-int indexing limit of R {Re: [R] allocMatrix limits}

Vadim Kutsyy vadim at kutsyy.com
Fri Aug 1 19:22:43 CEST 2008


Martin Maechler wrote:
> [[Topic diverted from R-help]]
>
> Well, fortunately, reasonable compilers have indeed kept 
> 'long' == 'long int'  to mean 32-bit integers
> ((less reasonable compiler writers have not, AFAIK: which leads
>   of course to code that no longer compiles correctly when
>   originally it did))
> But of course you are right that  64-bit integers
> (typically == 'long long', and really == 'int64') are very
> natural on 64-bit architectures.
> But see below.
>   
Well, in 64-bit Ubuntu, /usr/include/limits.h defines:

/* Minimum and maximum values a `signed long int' can hold.  */
#  if __WORDSIZE == 64
#   define LONG_MAX     9223372036854775807L
#  else
#   define LONG_MAX     2147483647L
#  endif
#  define LONG_MIN      (-LONG_MAX - 1L)

and a simple test program 
(http://home.att.net/~jackklein/c/inttypes.html#int) run on my desktop, 
which is a standard Intel computer, shows:

Signed long min: -9223372036854775808 max: 9223372036854775807

> If you have too large a numeric matrix, it would be larger than
> 2^31 * 8 bytes = 2^34 bytes = 2^34 / 2^20 MB ~= 16'000 Megabytes.
> If that is only 10% for you, you'd have around 160 GB of
> RAM.  That's quite impressive.
>   
 >  cat /proc/meminfo | grep MemTotal
MemTotal:     145169248 kB

We have a "smaller" SGI NUMAflex to play with, where the memory can be 
increased to 512 GB (the "larger" version doesn't have this "limitation").  
But even with commodity hardware you can easily get 128 GB for a reasonable 
price (e.g. Dell PowerEdge R900).

> Note that R objects are (pointers to) C structs that are
> "well-defined" platform independently, and I'd say that this
> should remain so.
>
>   
I forgot that R stores a two-dimensional array in a one-dimensional C 
array. Now I understand why there is a limit on the total number of 
elements.  But this is a big limitation.
> One of the last times this topic came up (within R-core),
> we found that for all the matrix/vector operations,
> we really would need versions of  BLAS / LAPACK that would also
> work with these "big" matrices, ie. such a BLAS/Lapack would
> also have to internally use "longer int" for indexing.
> At that point in time, we had decided we would at least wait to
> hear about the development of such BLAS/LAPACK libraries
BLAS supports a two-dimensional matrix definition, so if we stored a 
matrix as a two-dimensional object, we would be fine.  But then all R code 
as well as all packages would have to be modified.


