[Rd] R Bug: write.table for matrix of more than 2,147,483,648 elements
Serguei Sokol
sokol at insa-toulouse.fr
Thu Apr 19 11:47:20 CEST 2018
On 19/04/2018 at 09:30, Tomas Kalibera wrote:
> On 04/19/2018 02:06 AM, Duncan Murdoch wrote:
>> On 18/04/2018 5:08 PM, Tousey, Colton wrote:
>>> Hello,
>>>
>>> I want to report a bug in R that is limiting my capabilities to
>>> export a matrix with write.csv or write.table with over
>>> 2,147,483,648 elements (C's int limit). I found this bug already
>>> reported before:
>>> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17182. However,
>>> there appears to be no solution or fixes in upcoming R version
>>> releases.
>>>
>>> The error message is coming from the writetable part of the utils
>>> package in the io.c source
>>> code(https://svn.r-project.org/R/trunk/src/library/utils/src/io.c):
>>> /* quick integrity check */
>>> if(XLENGTH(x) != (R_len_t)nr * nc)
>>>     error(_("corrupt matrix -- dims not not match length"));
>>>
>>> The issue is that nr*nc is an integer and the size of my matrix, 2.8
>>> billion elements, exceeds C's limit, so the check forces the code to
>>> fail.
>>
>> Yes, looks like a typo: R_len_t is an int, and that's how nr was
>> declared. It should be R_xlen_t, which is bigger on machines that
>> support big vectors.
>>
>> I haven't tested the change; there may be something else in that
>> function that assumes short vectors.
> Indeed, I think the function won't work for long vectors because of
> EncodeElement2 and EncodeElement0. EncodeElement2/0 would have to be
> changed, including their signatures
That would be the definitive fix, but before such a deep rewrite is
undertaken, perhaps the following small fix (in addition to
"(R_xlen_t)nr * nc") would be sufficient for cases where nr and nc are
in int range but their product can reach the long vector limit:
replace
tmp = EncodeElement2(x, i + j*nr, quote_col[j], qmethod,
&strBuf, sdec);
by
tmp = EncodeElement2(VECTOR_ELT(x, i + (R_xlen_t)j*nr), 0,
quote_col[j], qmethod,
&strBuf, sdec);
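
For what it is worth, the cast has to sit on one operand of the
multiplication, otherwise j*nr is still evaluated in int and can
overflow before it is widened. Below is a minimal standalone sketch of
that index arithmetic. It is not code from io.c: xlen_t is a
hypothetical stand-in for R_xlen_t on a 64-bit build, and the helper
names offset and product_exceeds_int are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for R_xlen_t on a 64-bit build (assumption). */
typedef int64_t xlen_t;

/* Column-major offset of element (i, j) in a matrix with nr rows.
   j is widened before the multiplication, so the product is computed
   in 64 bits rather than in int. */
static xlen_t offset(int i, int j, int nr)
{
    assert(i >= 0 && j >= 0 && nr >= 0);
    return (xlen_t)i + (xlen_t)j * nr;
}

/* Returns 1 when nr * nc does not fit in an int, 0 otherwise.
   The product itself is computed in 64 bits to avoid int overflow. */
static int product_exceeds_int(int nr, int nc)
{
    return (xlen_t)nr * nc > INT_MAX;
}
```

With nr = 70000 and nc = 40000 (2.8 billion elements),
product_exceeds_int(nr, nc) is 1 and offset(nr - 1, nc - 1, nr) is
2799999999, which is beyond int range although each of i, j and nr is
well within it.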
Serguei