[Rd] strtoi output of empty string inconsistent across platforms

Martin Maechler maechler at stat.math.ethz.ch
Fri Jan 11 19:00:12 CET 2019

>>>>> Martin Maechler 
>>>>>     on Fri, 11 Jan 2019 09:44:14 +0100 writes:

>>>>> Michael Chirico 
>>>>>     on Fri, 11 Jan 2019 14:36:17 +0800 writes:

    >> Identified as root cause of a bug in data.table:
    >> https://github.com/Rdatatable/data.table/issues/3267

    >> On my machine, strtoi("", base = 2L) produces NA_integer_
    >> (which seems consistent with ?strtoi: "Values which
    >> cannot be interpreted as integers or would overflow are
    >> returned as NA_integer_").

    > indeed consistent with R's documentation on strtoi().
    > What machine would that be?

    >> But on all the other machines I've seen, 0L is
    >> returned. This seems to be consistent with the output of
    >> a simple C program using the underlying strtol function
    >> (see data.table link for this program, and for full
    >> sessionInfo() of some environments with differing
    >> output).

    >> So, what is the correct output of strtoi("", base = 2L)?

    >> Is the cross-platform inconsistency to be
    >> expected/documentable?

    > The inconsistency is certainly undesirable.  The relevant
    > utility function in R's source (<R>/src/main/character.c)
    > is

    > static int strtoi(SEXP s, int base)
    > {
    >     long int res;
    >     char *endp;
    >
    >     /* strtol might return extreme values on error */
    >     errno = 0;
    >
    >     if(s == NA_STRING) return(NA_INTEGER);
    >     res = strtol(CHAR(s), &endp, base); /* ASCII */
    >     if(errno || *endp != '\0') res = NA_INTEGER;
    >     if(res > INT_MAX || res < INT_MIN) res = NA_INTEGER;
    >     return (int) res;
    > }

    > and so it clearly is a platform-inconsistency in the
    > underlying C library's strtol().
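
To see what the C library does with an empty string on a given platform, a small probe along the following lines can help. It is only a sketch: probe_strtol() and print_probe() are hypothetical names, not part of R or of the program linked above. The C standard says strtol() returns 0 when no conversion can be performed, and POSIX additionally allows (but does not require) errno to be set to EINVAL in that case. That is one plausible source of the difference, though nothing in this thread confirms it.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record of what strtol() did with one input string. */
struct probe {
    long value;     /* value returned by strtol() */
    int  err;       /* errno after the call (0 if it was left untouched) */
    int  consumed;  /* 1 if at least one character was consumed */
};

/* Call strtol() on s and capture everything the quoted helper looks at. */
struct probe probe_strtol(const char *s, int base)
{
    struct probe p;
    char *endp;

    errno = 0;                  /* same reset as in the quoted helper */
    p.value = strtol(s, &endp, base);
    p.err = errno;
    p.consumed = (endp != s);   /* endp == s means nothing was converted */
    return p;
}

/* Print the probe result so outputs can be compared across machines. */
void print_probe(const char *s, int base)
{
    struct probe p = probe_strtol(s, base);
    printf("input=\"%s\" base=%d value=%ld errno=%d consumed=%d\n",
           s, base, p.value, p.err, p.consumed);
}
```

Calling print_probe("", 2) on machines that disagree should show whether they differ in the errno field, which is the only part of the result the quoted helper would treat differently for an empty string.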

(corrected typos here: )

    > I think we should make this cross-platform consistent ... 
    > and indeed it makes much sense to ensure the result of

    >     strtoi("", base=2L)    to become   NA_integer_

    > but chances are that would break code that has relied on
    > the current behavior {on "all but your computer" ;-)} ?

I still think that such a change should be done.

'make check-all' on the R source (+ Recommended packages) seems
not to signal any error or warning with such a change, so I plan
to commit that change to "the trunk" / "R-devel" soon, unless
strong concerns are raised (and quickly enough).
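
For illustration, one way such a change could look is sketched below as standalone C. This is an assumption about the shape of the fix, not the code that was or will be committed: strtoi_sketch() and SKETCH_NA_INTEGER are hypothetical names, a plain C string stands in for the SEXP, and NULL plays the role of NA_STRING. The only behavioural addition relative to the helper quoted above is the endp == s test, which treats "nothing was converted" as NA regardless of whether the C library sets errno.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Stand-in for R's NA_INTEGER (which is INT_MIN); deliberately renamed
   to make clear this is not R's own definition. */
#define SKETCH_NA_INTEGER INT_MIN

/* Hypothetical standalone variant of the quoted strtoi() helper. */
int strtoi_sketch(const char *s, int base)
{
    long int res;
    char *endp;

    if (s == NULL) return SKETCH_NA_INTEGER;   /* stands in for NA_STRING */

    /* strtol might return extreme values on error */
    errno = 0;
    res = strtol(s, &endp, base);

    /* endp == s: no characters were converted, e.g. for "" or " ".
       Checking this explicitly avoids relying on errno, which the C
       library may or may not set in that situation. */
    if (errno || endp == s || *endp != '\0') return SKETCH_NA_INTEGER;
    if (res > INT_MAX || res < INT_MIN) return SKETCH_NA_INTEGER;
    return (int) res;
}
```

With this variant, strtoi_sketch("", 2) yields SKETCH_NA_INTEGER on every platform, while valid inputs such as strtoi_sketch("101", 2) are unaffected.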

