[Rd] 4-int indexing limit of R {Re: [R] allocMatrix limits}
Vadim Kutsyy
vadim at kutsyy.com
Fri Aug 1 19:22:43 CEST 2008
Martin Maechler wrote:
> [[Topic diverted from R-help]]
>
> Well, fortunately, reasonable compilers have indeed kept
> 'long' == 'long int' to mean 32-bit integers
> ((less reasonable compiler writers have not, AFAIK: which leads
> of course to code that no longer compiles correctly when
> originally it did))
> But of course you are right that 64-bit integers
> (typically == 'long long', and really == 'int64') are very
> natural on 64-bit architectures.
> But see below.
>
Well, in 64-bit Ubuntu, /usr/include/limits.h defines:
/* Minimum and maximum values a `signed long int' can hold. */
# if __WORDSIZE == 64
# define LONG_MAX 9223372036854775807L
# else
# define LONG_MAX 2147483647L
# endif
# define LONG_MIN (-LONG_MAX - 1L)
and a simple test program
(http://home.att.net/~jackklein/c/inttypes.html#int) run on my desktop,
which is a standard Intel computer, shows:
Signed long min: -9223372036854775808 max: 9223372036854775807
> If you have too large a numeric matrix, it would be larger than
> 2^31 * 8 bytes ~= 2^34 / 2^20 ~= 16'000 Megabytes.
> If that is only 10% for you, you'd have around 160 GB of
> RAM. That's quite impressive.
>
> cat /proc/meminfo | grep MemTotal
MemTotal: 145169248 kB
We have a "smaller" SGI NUMAflex to play with, where the memory can be
increased to 512 GB (the "larger" version doesn't have this "limitation").
But even with commodity hardware you can easily get 128 GB for a reasonable
price (e.g. Dell PowerEdge R900).
> Note that R objects are (pointers to) C structs that are
> "well-defined" platform independently, and I'd say that this
> should remain so.
>
>
I forgot that R stores a two-dimensional array in a single one-dimensional
C array. Now I understand why there is a limit on the total number of
elements. But this is a big limitation.
> One of the last times this topic came up (within R-core),
> we found that for all the matrix/vector operations,
> we really would need versions of BLAS / LAPACK that would also
> work with these "big" matrices, ie. such a BLAS/Lapack would
> also have to internally use "longer int" for indexing.
> At that point in time, we had decided we would at least wait to
> hear about the development of such BLAS/LAPACK libraries
BLAS supports two-dimensional matrix definitions, so if we stored a
matrix as a two-dimensional object, we would be fine. But then all R code,
as well as all packages, would have to be modified.