[R] Interesting quirk with fractions and rounding

Fri Apr 21 23:03:43 CEST 2017

I've been following this thread with interest. A nice
collection of things to watch out for, if you don't
want the small arithmetic errors due to finite-length
digital representations of fractions to cause trouble!

However, as well as these small discrepancies, major
malfunctions can also result.

Back on Dec 22, 2013, I posted a Christmas Greetings
message to R-help:

  Season's Greetings (and great news ... )!

which starts:

  Greetings All!
  With the Festive Season fast approaching, I bring you joy
  with the news (which you will surely wish to celebrate)
  that R cannot do arithmetic!

  Usually, this is manifest in a trivial way when users report
  puzzlement that, for instance,

    sqrt(pi)^2 == pi
    # [1] FALSE

  which is the result of a (generally trivial) rounding or
  truncation error:

     sqrt(pi)^2 - pi
    # [1] -4.440892e-16

  But for some very simple calculations R goes off its head.

And the example given is:

  Consider a sequence generated by the recurrence relation

    x[n+1] = 2*x[n] if 0 <= x[n] <= 1/2
    x[n+1] = 2*(1 - x[n]) if 1/2 < x[n] <= 1

  (for 0 <= x[n] <= 1).

  This has equilibrium points (x[n+1] = x[n]) at x[n] = 0
  and at x[n] = 2/3:

    2/3 -> 2*(1 - 2/3) = 2/3

  It also has periodic points, e.g.

    2/5 -> 4/5 -> 2/5 (period 2)
    2/9 -> 4/9 -> 8/9 -> 2/9 (period 3)

  The recurrence relation can be implemented as the R function

    nextx <- function(x){
      if( (0<=x)&(x<=1/2) ) {x <- 2*x} else {x <- 2*(1 - x)}
    }

  Now have a look at what happens when we start at the equilibrium
  point x = 2/3:

    N <- 1 ; x <- 2/3
    while(x > 0){
      cat(sprintf("%i: %.9f\n",N,x))
      x <- nextx(x) ; N <- N+1
    }
    cat(sprintf("%i: %.9f\n",N,x))

For a while [run it and see!], this looks as though it's doing what
the arithmetic would lead us to expect: the first 24 results will all
be printed as 0.666666667, which looks fine as 2/3 to 9 places.

But then the "little errors" start to creep in:

  N=25: 0.666666666
  N=28: 0.666666672 
  N=46: 0.667968750
  N=47: 0.664062500
  N=48: 0.671875000
  N=49: 0.656250000
  N=50: 0.687500000
  N=51: 0.625000000
  N=52: 0.750000000
  N=53: 0.500000000
  N=54: 1.000000000
  N=55: 0.000000000

  What is happening is that, each time R multiplies by 2, the binary
  representation is shifted up by one and a zero bit is introduced
  at the bottom end.

At N=53, the first binary bit of 'x' is 1, and all the rest are 0,
so now 'x' is exactly 0.5 = 1/2, hence the final two are also exact
results; 53 is the Machine$double.digits = 53 binary places.

So this normally "almost" trivial feature can, for such a simple
calculation, lead to chaos or catastrophe (in the literal technical
sense).

For more detail, including an extension of the above, look at the
original posting in the R-help archives for Dec 22, 2013:

  From: (Ted Harding) <Ted.Harding at wlandres.net>
  Subject: [R] Season's Greetings (and great news ... )!
  Date: Sun, 22 Dec 2013 09:59:16 -0000 (GMT)

(Apologies, but I couldn't track down the URL for this posting
in the R-help archives; there were a few follow-ups).

I gave this as an example to show that the results of the "little"
arithmetic errors (such as have recently been discussed from many
aspects) can, in certain contexts, destroy a computation.

So be careful to consider what can happen in the particular
context you are working with.

There are ways to dodge the issue -- such as using the R interface
to the 'bc' calculator, which computes arithmetic expressions in
a way which is quite different from the fixed-finite-length binary
representation and algorithms used, not only by R, but also by many
other numerical computation software suites

Best wishes to all,
Ted.

-------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 21-Apr-2017  Time: 22:03:15
This message was sent by XFMail