[Rd] behavior of as.integer("5000000000")
Martin Maechler
maechler at lynne.stat.math.ethz.ch
Fri Apr 17 17:24:04 CEST 2015
>>>>> Martin Maechler <maechler at lynne.stat.math.ethz.ch>
>>>>> on Fri, 17 Apr 2015 15:49:35 +0200 writes:
>>>>> Hervé Pagès <hpages at fredhutch.org>
>>>>> on Mon, 13 Apr 2015 23:36:14 -0700 writes:
>> On 04/13/2015 11:32 PM, Martin Maechler wrote:
>>>
>>>> Hi,
>>>> > as.integer("5000000000")
>>>> [1] 2147483647
>>>> Warning message:
>>>> inaccurate integer conversion in coercion
>>>
>>>> > as.integer("-5000000000")
>>>> [1] NA
>>>> Warning message:
>>>> inaccurate integer conversion in coercion
>>>
>>>> Is this a bug or a feature? The man page suggests it's the
>>>> latter:
>>>
>>> I think you mean the "former", a bug.
>>>
>>> and I agree entirely; see the following "2 x 2" comparison:
>>>
>>> > N <- 5000000000000 * 8^-(0:7)
>>> > as.integer(N)
>>> [1] NA NA NA NA 1220703125 152587890 19073486 2384185
>>> Warning message:
>>> NAs introduced by coercion
>>> > as.integer(-N)
>>> [1] NA NA NA NA -1220703125 -152587890 -19073486
>>> [8] -2384185
>>> Warning message:
>>> NAs introduced by coercion
>>> > as.integer(as.character(N))
>>> [1] 2147483647 2147483647 2147483647 2147483647 1220703125 152587890 19073486 2384185
>>> Warning message:
>>> inaccurate integer conversion in coercion
>>> > as.integer(as.character(-N))
>>> [1] NA NA NA NA -1220703125 -152587890 -19073486
>>> [8] -2384185
>>> Warning message:
>>> inaccurate integer conversion in coercion
>>>
>>>
>>>
>>>> ‘as.integer’ attempts to coerce its argument to be of integer
>>>> type. The answer will be ‘NA’ unless the coercion succeeds.
>>>
>>>> even though someone could always argue that coercion of "5000000000"
>>>> succeeded (for some definition of "succeed").
>>>
>>>> Also is there any reason why the warning message is different than
>>>> with:
>>>
>>>> > as.integer(-5000000000)
>>>> [1] NA
>>>> Warning message:
>>>> NAs introduced by coercion
>>>
>>>> In the case of as.integer("-5000000000"), it's not really that the
>>>> conversion was "inaccurate", it's a little bit worse than that. And
>>>> knowing that NAs were introduced by coercion is important.
>>>
>>> Yes.
>>> The message is less a problem than the bug, but I agree we
>>> should try to improve it.
>> Sounds good. Thanks Martin,
> I've committed a change to R-devel now, such that also this case
> returns NA with a warning, actually for the moment with both the
> old warning and the 'NAs introduced by coercion' warning.
> The "nice thing" about the old warning is that it explicitly
> mentions integer coercion.
> I currently think we should keep that property, and I'd propose
> to completely drop the
> "inaccurate integer conversion in coercion"
> warning (it is not used anywhere else currently) and replace it
> in this and other as.integer(.) cases with
> 'NAs introduced by integer coercion'
> (or something similar. ... improvements / proposals are welcome).
Replying to myself:
I've found
'NAs introduced by coercion to integer range'
to be even more "spot on", and so will commit it today.
Of course, amendment proposals are still welcome.
Martin
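
For code that has to give NA for out-of-range strings regardless of which R version it runs on, one option is to go through double and check the range explicitly. The following is only a minimal sketch; the helper name as_integer_checked is hypothetical and not part of R, and the warning text simply reuses the wording proposed above.

```r
# Hypothetical helper (not part of base R): coerce strings to integer,
# returning NA for anything outside the representable integer range.
as_integer_checked <- function(x) {
  d <- suppressWarnings(as.numeric(x))          # non-numeric strings become NA
  limit <- .Machine$integer.max + 1             # 2^31, computed as a double
  out_of_range <- !is.na(d) & abs(d) >= limit   # too large in either direction
  d[out_of_range] <- NA_real_                   # mark those entries as NA
  if (any(out_of_range))
    warning("NAs introduced by coercion to integer range")
  as.integer(d)                                 # remaining values fit in an integer
}
```

With this sketch, as_integer_checked("5000000000") and as_integer_checked("-5000000000") both return NA with the same warning, while in-range strings such as "42" are converted as usual.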
> BTW, the fact that as.integer("-5000000000") produced an NA
> instead of -2147483647, which would have been consistent with as.integer("5000000000"),
> was just another coincidence, namely that we "currently" code NA_integer_
> by INT_MIN (for 32 bit integers, INT_MIN = -2147483648 = -2^31)
> [[but your C code must not rely on that, it is an implementation detail!]]
> Martin
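
The consequence of reserving one value for NA can be observed from R itself, without relying on how NA_integer_ is coded internally. A small sketch (the variable names are only illustrative): the usable integer range is symmetric, and one step below -2147483647 is not a valid integer.

```r
imax <- .Machine$integer.max                 # largest integer, 2^31 - 1
stopifnot(imax == 2^31 - 1)

lowest <- -imax                              # -2147483647, still a valid integer
stopifnot(is.integer(lowest), !is.na(lowest))

# One step further down overflows in integer arithmetic and gives NA
below <- suppressWarnings(lowest - 1L)
stopifnot(is.na(below))

# Coercing the double -2^31 to integer also gives NA
stopifnot(is.na(suppressWarnings(as.integer(-2^31))))
```

This is consistent with the remark above, but it only shows the R-level behaviour; it says nothing about the internal representation, which remains an implementation detail.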