[Rd] Converting non-32-bit integers from python to R to use bit64: reticulate
Martin Maechler
maechler at stat.math.ethz.ch
Sat Jun 1 18:29:01 CEST 2019
>>>>> Juan Telleria Ruiz de Aguirre
>>>>> on Thu, 30 May 2019 18:46:29 +0200 writes:
>Thank you Gabriel for valuable insights on the 64-bit integers topic.
>In addition, my statement was wrong, as Python3 seems to have unlimited
>(and variable) size integers.
....
If you are interested in using unlimited size integers, you
could use the CRAN R package 'gmp', which builds on GMP, the
GNU Multiple Precision Arithmetic Library (a C library):
https://cran.r-project.org/package=gmp
(and for arbitrary precision "floats", see CRAN pkg 'Rmpfr'
built on package gmp, and both the GNU C libraries GMP and
MPFR:
https://cran.r-project.org/package=Rmpfr
)
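
As a rough illustration of the Python side (a minimal sketch, not how
reticulate actually converts values): Python 3 integers grow as needed, so
a converter would have to check the range of each value before choosing an
R type. The helper name target_r_type and the three-way choice between R's
integer, bit64's integer64 and gmp's bigz are assumptions made up for this
example. It also assumes that both R's integer and bit64's integer64
reserve their smallest bit pattern for NA.

```python
# R's integer is a 32-bit signed int; its smallest bit pattern encodes NA_integer_.
R_INT_MAX = 2**31 - 1
R_INT_MIN = -R_INT_MAX

# Assumption: bit64's integer64 likewise reserves -2**63 for its NA value.
INT64_MAX = 2**63 - 1
INT64_MIN = -INT64_MAX


def target_r_type(x: int) -> str:
    """Hypothetical helper: name the smallest R type that holds x exactly."""
    if R_INT_MIN <= x <= R_INT_MAX:
        return "integer"
    if INT64_MIN <= x <= INT64_MAX:
        return "integer64"  # provided by the bit64 package
    return "bigz"           # provided by the gmp package


if __name__ == "__main__":
    # Python 3 ints never overflow; bit_length() shows how large they are.
    for value in (42, 2**31, 2**63, 2**100):
        print(value, value.bit_length(), target_r_type(value))
```

Values that land in the "bigz" branch are the ones for which a package
such as gmp would be needed on the R side.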
>The split between 32-bit and 64-bit integers seems to exist only in Python 2.
>Best,
>Juan
>On Wednesday, 29 May 2019, Gabriel Becker <gabembecker using gmail.com>
>wrote:
>> Hi Juan,
>>
>> Comments inline.
>>
>> On Wed, May 29, 2019 at 12:48 PM Juan Telleria Ruiz de Aguirre <
>> jtelleria.rproject using gmail.com> wrote:
>>
>>> Dear R Developers,
>>>
>>> There is an interesting issue related to "reticulate" R package which
>>> discusses how to convert Python's non-32 bit integers to R, which has had
>>> quite an exhaustive discussion:
>>>
>>> https://github.com/rstudio/reticulate/issues/323
>>>
>>> Python seems to handle integers differently from R, and is dependent on
>>> the system architecture: on 32-bit systems it uses 32-bit integers, and
>>> on 64-bit systems it uses 64-bit integers.
>>>
>>> So my question is:
>>>
>>> As regards R's C interface, how costly would it be to convert INTSXP from
>>> 32 bits to 64 bits using C, on 64-bit systems? Do the benefits surpass
>>> the costs? And should such development be handled from within R Core /
>>> Ordinary Members, or should it be left to package maintainers?
>>>
>>
>> Well, I am not an R-core member, but I can mention a few things:
>>
>> 1. This seems like it would make the results of R code non-reproducible
>> between 32-bit and 64-bit versions of R; at least some code would give
>> different results (at the very least in terms of when integer values
>> overflow to NA, which is documented behavior).
>> 2. Obviously all integer data would take twice as much memory, memory
>> bandwidth, space in caches, etc., even when it doesn't need it.
>> 3. Various places within the internal R sources treat data / data pointers
>> coming out of INTSXP and LGLSXP objects the same (as currently they're
>> both int/int*). Catching and fixing all those wouldn't be impossible, but
>> it would take at least some doing.
>>
>> For me personally 1 seems like a big problem, and 3 makes the conversion
>> more work than it might have seemed initially.
>>
>> As a related side note, as far as I understand what I've heard from R-core
>> members directly, the choice to not have multiple types of integers is
>> intentional and unlikely to change.
>>
>> Best,
>> ~G
>>
>>
>>
>>
>>>
>>> Thank you! :)