[R] Integer / floating point question

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri May 16 20:28:59 CEST 2008


'numeric' is a class but not a type -- so I think the FAQ is wrongly 
worded but the concept is well defined (despite the presence of 
is.numeric!).  But the FAQ does not say that all such numbers can be 
represented exactly, and indeed only some can.
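
A small sketch of the class/type distinction, using only base R (the 
variable names d and i are illustrative and not from the thread):

```r
d <- 2     # no L suffix, so this is stored as a double
i <- 2L    # the L suffix asks for integer storage

typeof(d)      # "double"
typeof(i)      # "integer"
class(d)       # "numeric"
class(i)       # "integer"
is.numeric(d)  # TRUE
is.numeric(i)  # TRUE: is.numeric covers both storage types
```

So typeof() never reports "numeric", while class() and is.numeric() do 
use the word, which is presumably where the FAQ wording comes from.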

On Sat, 17 May 2008, Berwin A Turlach wrote:

> G'day Erik,
>
> On Fri, 16 May 2008 10:45:43 -0500
> Erik Iverson <iverson at biostat.wisc.edu> wrote:
>
> [...]
>> The help page for '%%' addresses this a bit, but then caveats it with
>> 'up to rounding error', which is really my question.  Is there ever
>> 'rounding error' with 2.0 %% 1 as opposed to 2 %% 1?
>
> I am not in the position to give an authoritative answer, but I think
> there should be no problem with rounding error in the situation that
> you describe.  At least I hope there is no problem, otherwise I would
> consider this a serious issue. :)
>
>>>> However, my question is related to R FAQ 7.31, "Why doesn't R
>>>> think these numbers are equal?" The first sentence of that FAQ
>>>> reads, "The only numbers that can be represented exactly in R's
>>>> numeric type are integers and fractions whose denominator is a
>>>> power of 2."
>
> Again, I did not write this FAQ answer and cannot give an authoritative
> answer, but the word "integer" in that answer does not IMHO refer to
> variables in R that are of integer type; in particular since the answer
> discusses what kind of numbers "can be represented exactly in R's
> numeric type". (Perhaps this should actually be plural since there are
> several numeric types?)
>
> My interpretation is that 2.0 and 2 are both *text constants* that
> represent the integer 2, and that number is representable as a floating
> point number (and as an integer).
>
> The paper by Goldberg, referenced in FAQ 7.31, contains a discussion on
> whether it is possible (it is) to convert a floating point number
> from binary representation to decimal representation and then back;
> ending up with the same binary representation.  Questions of this kind
> are important if you use commands like write.table() or write.csv()
> which write out floating points in decimal representation, readable to
> normal humans.  When you read the data back in, you want to end up with
> the exact same binary representation of the numbers.  Goldberg is
> indeed an interesting paper to read.

Possible to write out and read in *on the same computer*. R doesn't aspire 
to that, as it assumes text files are for humans or transfer to unknown 
other programs, and does provide binary save formats.  (Human-friendly 
numbers will be written out faithfully, but if there is a choice between a 
short representation and one with a sequence of 9s, the short one will be 
chosen.)
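
A minimal sketch of the difference between the two routes, assuming 
write.csv()/read.csv() for the text route and saveRDS()/readRDS() as one 
of the binary save formats (the values and file names are made up for 
illustration):

```r
x <- c(1/3, 0.1, 2)

# Text round trip: numbers are written as decimal strings
# (to my understanding with at most 15 significant digits).
csv_file <- tempfile(fileext = ".csv")
write.csv(data.frame(x = x), csv_file, row.names = FALSE)
x_text <- read.csv(csv_file)$x

# Binary round trip: the doubles themselves are stored.
rds_file <- tempfile(fileext = ".rds")
saveRDS(x, rds_file)
x_bin <- readRDS(rds_file)

identical(x, x_bin)            # binary copy is the same, bit for bit
isTRUE(all.equal(x, x_text))   # text copy agrees to within tolerance
x - x_text                     # any exact differences show up here

unlink(c(csv_file, rds_file))
```

If exact reproduction matters, the binary route is the one to rely on; 
the text route is only expected to agree up to the digits written.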

> And the comments I made above are based on my understanding of
> Goldberg: 2 and 2.0 are both decimal representations of the integer 2,
> and this number has an exact representation (in integer type variables
> and in floating point type variables).  Hence, both these decimal
> representations should lead to a binary representation that corresponds
> to that number exactly.

They should, at least for small integer values.  In R this relies on 
strtod for which C99 and POSIX say

   If the subject sequence has the decimal form and at most DECIMAL_DIG
   (defined in <float.h>) significant digits, the result should be
   correctly rounded [...].

So this means that x.0 for x up to 10^15 or so should be represented 
exactly and be different from x.y for y = 1...9.

Note too that non-integer values > 10^16 or so will be represented as 
integers.
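
A quick way to see both effects, assuming IEEE 754 doubles (which R uses 
on all current platforms, as far as I know); the particular literals are 
just illustrative:

```r
# Decimal literals for small integers denote exactly the same double.
identical(2.0, 2)                         # TRUE
2.0 == 2L                                 # TRUE

# Below about 10^15, x.0 and x.1 are distinct doubles.
123456789012345.0 == 123456789012345.1    # FALSE

# From 2^53 (about 9 * 10^15) upwards every double is a whole number,
# so a fractional part in the literal cannot survive.
big <- 12345678901234567.8
big == floor(big)                         # TRUE
2^53 + 1 == 2^53                          # TRUE
```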

Lots of 'should' here -- compiler and runtime writers do make mistakes and 
may not have been working to the C99 standard (e.g. the Windows runtime 
predates it).


> Thus, I would expect
> R> x <- 2.0
> R> x %% 1 == 0
> always to work and to return TRUE.  It is things like:
> R> x <- sqrt(2.0)^2
> R> x %% 1 == 0
> that FAQ 7.31 is about and what, IMHO, the comment in the help page
> of %% warns about; if the variable x contains a value that was created
> by some finite precision floating point calculations.  But the
> conversion from a textual representation of an integer to a binary
> representation should not create problems.
>
> Cheers,
>
> 	Berwin
>
> =========================== Full address =============================
> Berwin A Turlach                            Tel.: +65 6515 4416 (secr)
> Dept of Statistics and Applied Probability        +65 6515 6650 (self)
> Faculty of Science                          FAX : +65 6872 3919
> National University of Singapore
> 6 Science Drive 2, Blk S16, Level 7          e-mail: statba at nus.edu.sg
> Singapore 117546                    http://www.stat.nus.edu.sg/~statba
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
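
The two cases quoted above can be tried directly.  This is only a sketch: 
is_whole() is a hypothetical helper (not part of base R) and its default 
tolerance is an arbitrary choice.

```r
x <- 2.0
x %% 1 == 0                # TRUE: the literal converts exactly

y <- sqrt(2.0)^2           # result of finite-precision arithmetic
y %% 1 == 0                # FALSE: y is slightly larger than 2
isTRUE(all.equal(y, 2))    # TRUE: equal within a tolerance

# Hypothetical helper: is a value a whole number, up to a tolerance?
is_whole <- function(v, tol = sqrt(.Machine$double.eps)) {
  abs(v - round(v)) < tol
}
is_whole(c(x, y, 2.5))     # TRUE TRUE FALSE
```

Comparing with a tolerance is the usual remedy when the value comes out of 
a computation rather than from a literal.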

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


