[R] Strange behaviour of as.integer()

(Ted Harding) Ted.Harding at manchester.ac.uk
Fri Jan 8 19:13:44 CET 2010


See below.

>> From: r-help-bounces at r-project.org 
>> [mailto:r-help-bounces at r-project.org] On Behalf Of Ulrich Keller
>> Sent: Thursday, January 07, 2010 4:32 AM
>> To: r-help at r-project.org
>> Subject: [R] Strange behaviour of as.integer()
>> 
>> I have encountered a strange behaviour of as.integer() which does not
>> seem correct to me. Sorry if this is just an indication of me not
>> understanding floating point arithmetic.
>> 
>> > .57 * 100
>> [1] 57
>> > .29 * 100
>> [1] 29
>> 
>> So far, so good. But:
>> 
>> > as.integer(.57 * 100)
>> [1] 56
>> > as.integer(.29 * 100)
>> [1] 28
>> 
>> Then again:
>> 
>> > all.equal(.57 * 100, as.integer(57))
>> [1] TRUE
>> > all.equal(.29 * 100, as.integer(29))
>> [1] TRUE
>> 
>> This behaviour is the same in R 2.10.1 (Ubuntu and Windows) and 2.9.2
>> (Windows), all 32 bit versions. Is this really intended?

I would like to add a salutory tail-piece to this correspondence.
It is a simple recursive calculation which, after not many steps,
R will in almost all cases get badly wrong as a result of the
finite binary representation of fractions. (I have posted it before,
some years ago, but it may be worth bringing it back since it shows
very vividly what can go wrong if you do not pay attention to this
aspect of numerical computation).

Working in the interval [0,1], X[n+1] is calculated from x[n]
according to:

  if 0 <= X[n] <= 1/2 then X[n+1] = 2*X[n]
  if 1/2 <= X[n] <= 1 then X[n+1] = 2*(1 - X[n])

X = 2/3 is a fixed point of this, since 2/3 -> 2*(1 - 2/3) = 2/3;
X = 2/5 -> 4/5 -> 2*(1 - 4/5) = 2/5 has period 2;
X = 2/7 -> 4/7 -> 2*(1 - 4/7) = 6/7 -> 2*(1 - 6/7) = 2/7 (period 3)
and so on.

All fractions which are multiples of 1/(2^k) for some k>0 eventually
end up at 0 and stay there. Irrational numbers, mathematically,
never repeat.

However, none of the above periodic numbers 2/(2*m + 1) can be
represented exactly in a finite binary representation, and that is
where the trouble starts. So, in R:

  nextX <- function(x){if(x <= 0.5) (2*x) else (2*(1 - x))}

Now try the fixed point x=2/3:

  i<-0; x<-2/3; while(x>0){i<-(i+1); x<-nextX(x) ; print(c(i,x))}

of which the last few lines are:

  [1] 46.0000000  0.6640625
  [1] 47.000000  0.671875
  [1] 48.00000  0.65625
  [1] 49.0000  0.6875
  [1] 50.000  0.625
  [1] 51.00  0.75
  [1] 52.0  0.5
  [1] 53  1
  [1] 54  0

Similarly try any of the other periodic values, e.g.

  i<-0; x<-2/11; while(x>0){i<-(i+1); x<-nextX(x) ; print(c(i,x))}

They will all halt at x=0 after 50-55 iterations. Similarly
a non-periodic number such as 1/sqrt(2) or 1/pi will also fail.

Thus the results of such calculations will eventually be grossly
wrong. So do not try this kind of calculation in R! -- at any rate
not without adopting special measures. For instance, it would be
possible, for rational x = m/n, to program a function which kept
track of M and N in the rational expression M/N of the result after
each iteration, by working out what M and N would be. Then, at any
iteration, the numerical value of M/N could be computed (to within
the precision used by R) and returned.

This would still give wrong answers for irrational starting numbers,
since they would have to be stored as rational approximations and
then would either end up at 0 or be periodic while the mathematical
result would never repeat, so eventually they would be arbitrarily
far apart (within [0,1]).

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 08-Jan-10                                       Time: 18:13:41
------------------------------ XFMail ------------------------------



More information about the R-help mailing list