[R] Integer / floating point question

Fri May 16 19:44:05 CEST 2008

On 5/16/2008 11:45 AM, Erik Iverson wrote:
> Marc -
> 
> Marc Schwartz wrote:
>> on 05/16/2008 09:56 AM Erik Iverson wrote:
>>> Dear R-help -
>>>
>>> I have thought about this question for a bit, and come up with no 
>>> satisfactory answer.
>>>
>>> Say I have the numeric vector t1, given as
>>>
>>> t1 <- c(1.0, 1.5, 2.0, 2.5, 3.0)
>>>
>>> I simply want to reliably extract the unique integers from t1, i.e., 
>>> the vector c(1, 2, 3).  This is of course superficially simple to 
>>> carry out.
>> 
>> Use modulo division:
>> 
>>  > t1[t1 %% 1 == 0]
>> [1] 1 2 3
>> 
>> or
>> 
>>  > unique(t1[t1 %% 1 == 0])
>> [1] 1 2 3
> 
> Yes, that is one of the solutions.  However, can I be sure that, say,
> the following is always TRUE?
> 
> 2.0 %% 1 == 0
> 
> The help page for '%%' addresses this a bit, but then caveats it with 
> 'up to rounding error', which is really my question.  Is there ever 
> 'rounding error' with 2.0 %% 1 as opposed to 2 %% 1?

If you enter them as literals in your source, then 2.0 and 2 are 
guaranteed to be the same number, because both are exactly representable 
as the ratio of an integer and a power of 2:  2/2^0, or 1/2^(-1).  (There 
are limits on the range of both the numerator and denominator for this to 
work, but they are quite wide: in an IEEE 754 double, every integer up to 
2^53 in magnitude is exact.)
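
As a rough illustration of the point about literals, the sketch below 
checks the original t1 example and probes where exact integer 
representation runs out.  It assumes IEEE 754 doubles (a 53-bit 
significand); the variable names are made up for this example.

```r
# Literals 2.0 and 2 parse to the same double, so both tests hold exactly.
same_literal <- identical(2.0, 2)
no_remainder <- (2.0 %% 1) == 0

# Assuming a 53-bit significand: integers are exact up to 2^53, and
# 2^53 + 1 is the first integer that cannot be stored exactly.
exact_below_limit <- (2^53 - 1) != 2^53
collides_above    <- (2^53 + 1) == 2^53

# The original task: unique whole numbers among literal values.
t1   <- c(1.0, 1.5, 2.0, 2.5, 3.0)
ints <- unique(t1[t1 %% 1 == 0])
```

With literal input like t1, the exact comparison against 0 is safe; the 
2^53 lines only show how far away the range limit is.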

If you calculate them, e.g. as 0.2*10, then there is no guarantee, and 
the results may vary from machine to machine.  This is because 0.2 is 
*not* representable as an integer over a power of two.  It will likely 
be represented to 52 or 53 bit precision, but with some 
compiler/hardware combinations, you might get 64 bit (or other) 
precision in intermediate results.  I don't think R currently does this, 
but I wouldn't be very surprised if there were situations where it did.
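
For computed values, one common workaround is to test against a 
tolerance rather than for exact equality.  The sketch below is only 
that: is_whole() is a hypothetical helper, not a base R function, and 
the default tolerance is an arbitrary choice that should be adapted to 
the scale of the data.

```r
# Hypothetical helper (not in base R): TRUE where x lies within tol of
# the nearest integer.
is_whole <- function(x, tol = sqrt(.Machine$double.eps)) {
  abs(x - round(x)) < tol
}

# Mathematically 2, but the result carries rounding error.
computed <- sqrt(2)^2

# The exact test is expected to fail on IEEE 754 doubles ...
exact_test <- (computed %% 1) == 0
# ... while the tolerance-based test accepts the value.
tolerant_test <- is_whole(computed)

# Extract unique whole numbers, rounding to remove the residual error.
t2         <- c(1.0, 1.5, computed, 2.5, 3.0)
whole_vals <- unique(round(t2[is_whole(t2)]))
```

Rounding before unique() matters here: without it, a computed 2 and a 
literal 2 could be kept as two distinct values.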

There might be cases where R doesn't correctly convert literal numeric 
constants into the closest floating point value, but I think it would be 
considered a serious bug if it messed up small integers.
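
If in doubt, this is easy to spot-check on a given installation.  The 
sketch below parses the literals "0.0" through "1000.0" and compares 
them with the corresponding integers; the range 0:1000 is an arbitrary 
choice for illustration, and passing the check is evidence, not proof.

```r
# Parse each literal such as "17.0" and evaluate it to a double.
small  <- 0:1000
parsed <- vapply(
  small,
  function(i) eval(parse(text = sprintf("%d.0", i))),
  numeric(1)
)

# TRUE if every parsed literal equals the integer it was built from.
all_match <- all(parsed == small)
```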

Duncan Murdoch

> 
>> 
>>> However, my question is related to R FAQ 7.31, "Why doesn't R think 
>>> these numbers are equal?" The first sentence of that FAQ reads, "The 
>>> only numbers that can be represented exactly in R's numeric type are 
>>> integers and fractions whose denominator is a power of 2."
>>>
>>> All the methods I've devised to do the above task seem to ultimately 
>>> rely on the fact that identical(x.0, x) == TRUE, for integer x.
>>>
>>> My assumption, which I'm hoping can be verified, is that, for example, 
>>> 2.0 (when, say, entered at the prompt and not computed from an 
>>> algorithm) is an integer in the sense of FAQ 7.31.
>>>
>>> This seems to be the case on my machine.
>>>
>>>  > identical(2.0, 2)
>>> [1] TRUE
>>>
>>> Apologies that this is such a trivial question; it seems so obvious on 
>>> the surface, but I just want to be sure I am understanding it correctly.
>> 
>> Keep in mind that by default and unless specifically coerced to integer, 
>> numbers in R are double precision floats:
>> 
>>  > is.integer(2)
>> [1] FALSE
>> 
>>  > is.numeric(2)
>> [1] TRUE
>> 
>>  > is.integer(2.0)
>> [1] FALSE
>> 
>>  > is.numeric(2.0)
>> [1] TRUE
>> 
>> 
>> So:
>> 
>>  > identical(2.0, as.integer(2))
>> [1] FALSE
>> 
>> 
>> Does that help?
> 
> A bit, and this is the source of my confusion.  Can I always assume that 
> 2.0 == 2 when the class of each is 'numeric'?
> 
>> 
>> Marc Schwartz
> 