[R] double precision
Duncan Murdoch
murdoch at stats.uwo.ca
Wed Aug 19 20:40:49 CEST 2009
On 8/19/2009 1:49 PM, miller_2555 wrote:
>
> Roger Bivand wrote:
>>
>> On Tue, 5 Dec 2006, Yoni Schamroth wrote:
>>
>>> Hi,
>>>
>>> I am attempting to query a data frame from a mysql database.
>>> One of the variables is a unique identification number ("numeric") 18
>>> digits
>>> long.
>>> I am struggling to retrieve this variable exactly without any rounding.
>>
>> Read it as a character - a double is a double:
>>
>>> x <- 6527600583317876352
>>> y <- 6527600583317876380
>>> all.equal(x,y)
>> [1] TRUE
>>> storage.mode(x)
>> [1] "double"
>>
>> and why they are equal is a FAQ (only ~16 digits in a double). Integer is
>> 4-byte. Since they are IDs, not to be used for math, leave them as
>> character strings - which they are, like telephone numbers.
>>
>
> Resurrecting this post for a moment, the same issue arose when interfacing R
> with a Postgres database using the bigint data type (a signed 64-bit integer
> ranging from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 as of
> this writing). While the underlying cause is summarized above, I'd like to
> recommend the inclusion of a 64-bit integer data type into the R base. For
> performance reasons, I use R to independently generate a unique transaction
> ID that is equivalent to the Postgres-generated bigint (with some
> consistency checks -- generally bad design, but vastly quicker than
> querying the database for the same value). I currently generate a string
> representation and pass that to the DBI, though the process is cumbersome
> and likely not as efficient as an arithmetic equivalent (particularly when
> using a 64-bit processor architecture). Furthermore, there are additional
> gyrations that need to occur when querying the database for bigint values.
> Do significant practical challenges exist in the implementation of a 64-bit
> integer that would outweigh the faster and cleaner compatibility with
> database backends?

I believe the C99 standard doesn't require that an exactly 64-bit signed
integer type exist (only one that is at least 64 bits wide), so that
would likely cause some headaches. And we may still use some compilers
that are not C99 compliant, which may not have any type that big.

But an even bigger problem is that there is a lot of type-specific code
in R. Adding another primitive type like a 64-bit signed integer would
mean writing arithmetic routines for that type and deciding how it
interacts with all the other numeric types. For example: what happens if
you add a floating-point double to a 64-bit int? Normally, adding a
double to an int coerces the result to double. But a double has only a
53-bit significand, so it can't hold every 64-bit int exactly, and doing
something like x + 1 could lose precision in x.
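
As a rough illustration of that precision loss (a sketch only, assuming
IEEE 754 doubles, which is what R uses on all common platforms; the
variable names are made up for the example, and the two IDs are the ones
quoted earlier in the thread):

```r
# Assumption: doubles are IEEE 754 binary64 with a 53-bit significand.
big <- 2^53                          # 9007199254740992, exactly representable

# Adding 1 cannot be represented, so the sum rounds back to 2^53.
same_after_add <- (big + 1 == big)

# The two 19-digit IDs differ by 28, but doubles near 6.5e18 are spaced
# 1024 apart, so both literals are stored as the same double.
x <- 6527600583317876352
y <- 6527600583317876380
ids_collide <- (x == y)

# Kept as character strings, the IDs stay distinct.
ids_equal_as_text <- identical("6527600583317876352", "6527600583317876380")

print(c(same_after_add, ids_collide, ids_equal_as_text))
print(.Machine$double.digits)        # significand bits in a double
print(.Machine$integer.max)          # largest value of R's 4-byte integer
```

Any 64-bit integer type that was silently coerced to double in mixed
arithmetic would run into the same collapse once values pass 2^53.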

So I imagine this will happen eventually, but it will not be easy, and
it probably won't happen soon.

Duncan Murdoch