[Rd] R Bug: write.table for matrix of more than 2,147,483,648 elements

Serguei Sokol sokol at insa-toulouse.fr
Thu Apr 19 11:47:20 CEST 2018


On 19/04/2018 at 09:30, Tomas Kalibera wrote:
> On 04/19/2018 02:06 AM, Duncan Murdoch wrote:
>> On 18/04/2018 5:08 PM, Tousey, Colton wrote:
>>> Hello,
>>>
>>> I want to report a bug in R that is limiting my capabilities to 
>>> export a matrix with write.csv or write.table with over 
>>> 2,147,483,648 elements (C's int limit). I found this bug already 
>>> reported before: 
>>> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17182. However, 
>>> there appears to be no solution or fixes in upcoming R version 
>>> releases.
>>>
>>> The error message is coming from the writetable part of the utils 
>>> package in the io.c source 
>>> code (https://svn.r-project.org/R/trunk/src/library/utils/src/io.c):
>>> /* quick integrity check */
>>>                  if(XLENGTH(x) != (R_len_t)nr * nc)
>>>                      error(_("corrupt matrix -- dims not not match length"));
>>>
>>> The issue is that nr*nc is an integer and the size of my matrix, 2.8 
>>> billion elements, exceeds C's limit, so the check forces the code to 
>>> fail.
>>
>> Yes, looks like a typo:  R_len_t is an int, and that's how nr was 
>> declared.  It should be R_xlen_t, which is bigger on machines that 
>> support big vectors.
>>
>> I haven't tested the change; there may be something else in that 
>> function that assumes short vectors.
> Indeed, I think the function won't work for long vectors because of 
> EncodeElement2 and EncodeElement0. EncodeElement2/0 would have to be 
> changed, including their signatures

Changing EncodeElement2/0 would be the definitive fix, but before such 
a deep rewrite is undertaken, perhaps the following small fix (in 
addition to "(R_xlen_t)nr * nc") will be sufficient for cases where nr 
and nc are each in int range but their product reaches the long vector 
range:

replace
     tmp = EncodeElement2(x, i + j*nr, quote_col[j], qmethod,
                     &strBuf, sdec);
by
     /* cast j before multiplying so that j*nr is not evaluated in int */
     tmp = EncodeElement2(VECTOR_ELT(x, i + (R_xlen_t)j*nr), 0,
                     quote_col[j], qmethod, &strBuf, sdec);

Serguei
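
The two computations discussed in this thread, the length check and the 
column-major element index, can be illustrated outside of R with a minimal 
standalone sketch. It is not the code in io.c: xlen_t is a hypothetical 
stand-in for R_xlen_t (assumed to be 64 bits wide), the function names are 
made up for illustration, and int is assumed to be 32 bits. In both 
functions the cast is applied to an operand of the multiplication itself, 
because casting only the other summand or the finished product would still 
leave the multiplication to be evaluated in int.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for R_xlen_t; assumed 64-bit in this sketch. */
typedef int64_t xlen_t;

/* Returns 1 if nr * nc fits in a C int, 0 otherwise.
   The product is formed in 64 bits, so it cannot overflow here. */
int product_fits_int(int nr, int nc)
{
    assert(nr >= 0 && nc >= 0);
    return (xlen_t)nr * nc <= INT_MAX;
}

/* Length check in the spirit of the "quick integrity check":
   nr is widened first, so the multiplication is done in 64 bits. */
int dims_match_length(xlen_t len, int nr, int nc)
{
    assert(nr >= 0 && nc >= 0);
    return len == (xlen_t)nr * nc;
}

/* Column-major offset of element (i, j) in a matrix with nr rows.
   j is widened before the multiplication; i is then promoted
   automatically when it is added to the 64-bit product. */
xlen_t col_major_index(int i, int j, int nr)
{
    assert(i >= 0 && j >= 0 && nr >= 0);
    return i + (xlen_t)j * nr;
}
```

For example, with hypothetical dimensions of 70000 rows by 40000 columns 
(2.8 billion elements, the size mentioned in the report), 
product_fits_int(70000, 40000) is 0, while the widened check and the 
widened index still give exact results.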


