[Rd] strtoi output of empty string inconsistent across platforms
Martin Maechler
m@echler @ending from @t@t@m@th@ethz@ch
Fri Jan 11 09:44:14 CET 2019
>>>>> Michael Chirico
>>>>> on Fri, 11 Jan 2019 14:36:17 +0800 writes:
> Identified as root cause of a bug in data.table:
> https://github.com/Rdatatable/data.table/issues/3267
> On my machine, strtoi("", base = 2L) produces NA_integer_
> (which seems consistent with ?strtoi: "Values which cannot
> be interpreted as integers or would overflow are returned
> as NA_integer_").
Indeed, that is consistent with R's documentation of strtoi().
What machine would that be?
> But on all the other machines I've seen, 0L is
> returned. This seems to be consistent with the output of a
> simple C program using the underlying strtol function (see
> data.table link for this program, and for full
> sessionInfo() of some environments with differing output).
> So, what is the correct output of strtoi("", base = 2L)?
> Is the cross-platform inconsistency to be
> expected/documentable?
The inconsistency is certainly undesirable.
The relevant utility function in R's source (<R>/src/main/character.c)
is
static int strtoi(SEXP s, int base)
{
    long int res;
    char *endp;

    /* strtol might return extreme values on error */
    errno = 0;

    if(s == NA_STRING) return(NA_INTEGER);
    res = strtol(CHAR(s), &endp, base); /* ASCII */
    if(errno || *endp != '\0') res = NA_INTEGER;
    if(res > INT_MAX || res < INT_MIN) res = NA_INTEGER;
    return (int) res;
}
and so it clearly is a platform-inconsistency in the underlying C
library's strtol().
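
To make the mechanism concrete: the C standard says that when no conversion can be performed, strtol() returns 0 and stores the start of the string in endp, and POSIX additionally allows (but does not require) errno to be set to EINVAL in that case. For an empty string, *endp is '\0' either way, so the check above yields NA only where the C library sets errno. The following is a minimal sketch for probing a given platform; the struct and the helper name probe_strtol are hypothetical and are not part of R's sources.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper type: records what strtol() did with an input. */
struct strtol_probe {
    long value;    /* value returned by strtol() */
    int  err;      /* errno after the call (0 if left untouched) */
    int  consumed; /* number of characters strtol() consumed */
};

/* Call strtol() the same way R's strtoi() does and report the outcome. */
static struct strtol_probe probe_strtol(const char *str, int base)
{
    struct strtol_probe p;
    char *endp;

    errno = 0;                        /* reset, as in R's strtoi() */
    p.value = strtol(str, &endp, base);
    p.err = errno;                    /* 0 or EINVAL for "", by platform */
    p.consumed = (int)(endp - str);   /* 0 means no conversion happened */
    return p;
}
```

Calling probe_strtol("", 2) should give value 0 and consumed 0 on any conforming platform; only the err field is expected to differ between C libraries.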
I think we should make this cross-platform consistent... and
indeed it makes much sense to ensure that the result of
strtoi("", base=2L) becomes NA_integer_,
but chances are that this would break code that has relied on the
current behavior {on "all but your computer" ;-)} ?
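
One way such a platform-independent result could be obtained is to test explicitly whether strtol() consumed any characters, instead of relying on errno. The sketch below is an illustration only, not a patch to R: it works on a plain C string rather than a SEXP, and the names strtoi_consistent and NA_INT are hypothetical stand-ins (NA_INT is defined as INT_MIN, the value R uses for NA_INTEGER).

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical stand-in for R's NA_INTEGER, which is INT_MIN. */
#define NA_INT INT_MIN

/* Sketch of a strtoi() variant that returns NA for inputs with no digits,
 * whether or not the C library sets errno in that case. */
static int strtoi_consistent(const char *str, int base)
{
    long res;
    char *endp;

    if (str == NULL) return NA_INT;   /* plays the role of NA_STRING */

    errno = 0;
    res = strtol(str, &endp, base);

    /* endp == str: nothing was converted (e.g. "" or only whitespace).
     * *endp != '\0': trailing characters that are not part of the number. */
    if (errno || endp == str || *endp != '\0') return NA_INT;
    if (res > INT_MAX || res < INT_MIN) return NA_INT;
    return (int) res;
}
```

With this check, strtoi_consistent("", 2) gives NA_INT on every platform, while valid inputs such as "101" in base 2 still convert as before. Whether that change is acceptable for existing code is the open question raised above.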
> Michael Chirico
Thank you for the report,
Martin Maechler
ETH Zurich and R Core Team