[R] a==0 vs as.integer(a)==0 vs all.equal(a,0)

Duncan Murdoch murdoch at stats.uwo.ca
Tue Mar 8 10:56:30 CET 2005


On Tue, 8 Mar 2005 09:03:43 +0000, Robin Hankin
<r.hankin at soc.soton.ac.uk> wrote :

>hi
>
>
>?integer says:
>
>      Note that on almost all implementations of R the range of
>      representable integers is restricted to about +/-2*10^9: 'double's
>      can hold much larger integers exactly.
>
>
>I am getting very confused as to when to use integers and when not to.
>In my line I need exact comparisons of large integer-valued arrays, so
>I often use as.integer(), but the above seems to tell me that doubles
>might be better.
>
>Consider the following R idiom of Euclid's algorithm for the highest 
>common factor
>of two positive integers:
>
>   gcd <- function(a, b){
>     if (b == 0){ return(a)}
>     return(Recall(b, a%%b))
>   }
>
>If I call this with gcd(10,12), for example, then a%%b is not an
>integer, so the first line of the function, testing b for being zero,
>isn't legitimate.

When you say it isn't legitimate, you mean that it violates the advice
never to use exact comparison on floating point values?

I think that's just advice, not a hard and fast rule.  If you happen
to know that the values being compared have been calculated and stored
exactly, then "==" is valid.  In your function, when a and b are
integer-valued and within the range that doubles represent exactly
(I'm not sure of the precise limit, but it approaches +/- 2^53), the
%% operator should return exact results.  (Does it do so on all
platforms?  I'm not sure, but I'd call it a bug if it didn't, unless a
and/or b were very close to the upper limit of exactly representable
integers.)

Do you know of examples where a and b are integers stored in floating
point, and a %% b returns a value that is different from as.integer(a)
%% as.integer(b)?
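
One way to probe that question is a random spot check.  The sketch
below is only an illustration, not a proof: the seed, the sample size
and the upper bound of 1e9 are arbitrary choices, and the names r_int
and r_dbl are made up for this example.  It compares %% on values
stored as integer with %% on the same values stored as double, then
shows where integer-valued doubles stop being exact.

```r
# Spot check: does %% on integer-valued doubles agree with %% on integers?
set.seed(1)                                   # arbitrary, for reproducibility
n <- 10000
a <- sample.int(1e9, n, replace = TRUE)       # stored as integer
b <- sample.int(1e9, n, replace = TRUE)       # stored as integer, never 0

r_int <- a %% b                               # integer arithmetic
r_dbl <- as.double(a) %% as.double(b)         # same values stored as double

# TRUE means no discrepancy was found in this particular sample
no_discrepancy <- all(r_int == r_dbl)
print(no_discrepancy)

# Integer-valued doubles are exact only up to 2^53 in magnitude
print(2^53 - 1 == 2^53)    # FALSE: both values are representable
print(2^53 == 2^53 + 1)    # TRUE: 2^53 + 1 is rounded to 2^53
```

A sample like this can only fail to find a counterexample; it says
nothing about values near the 2^53 limit or about other platforms.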


>
>OK, so I have some options:
>
>(1) stick in "a <- as.integer(a),  b <- as.integer(b)" into the
>    function: then a%%b *will* be an integer and the "==" test is
>    appropriate
>(2) use some test like abs(b) < TOL for some suitable TOL (0.5?)
>(3) use identical(all.equal(b,0),TRUE) like it says in identical.Rd
>(4) use identical(all.equal(b,as.integer(0)),TRUE)

I'd suggest

(5) Use your gcd function almost as above, but modified to work on
vectors:

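   # a and b are assumed to have the same length
   gcd <- function(a, b){
     result <- a                 # where b == 0, the answer is a
     nonzero <- b != 0           # elements that still need reducing
     if (any(nonzero))
       result[nonzero] <- Recall(b[nonzero], a[nonzero] %% b[nonzero])
     return(result)
   }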
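
As a usage sketch for the vectorized version, the block below repeats
the definition so that it runs on its own and applies it to three
pairs at once.  The input values are chosen only for illustration, and
the inputs are assumed to be non-negative integer-valued vectors of
equal length.

```r
# Vectorized Euclid: elements with b == 0 are finished, the rest recurse
gcd <- function(a, b){
  result <- a
  nonzero <- b != 0
  if (any(nonzero))
    result[nonzero] <- Recall(b[nonzero], a[nonzero] %% b[nonzero])
  return(result)
}

# Three pairs: (10, 12), (21, 14), (7, 0)
out <- gcd(c(10, 21, 7), c(12, 14, 0))
print(out)   # 2 7 7
```

The exact "!=" test is applied to remainders of integer-valued
doubles, which is the case discussed above where exact comparison is
valid.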

>How does the List deal with this kind of problem?
>
>Also, gcd() as written returns a non-integer.  Would the List recommend
>rewriting the last line as
>
>return(as.integer(Recall(b,a%%b)))
>
>or not?

I'd say not.  Your original function returns integer when both a and b
are stored as integers, and double when at least one of them is not.
That seems like reasonable behaviour to me.
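
The storage types can be checked directly.  This sketch repeats the
original scalar function from the question and inspects the result
with typeof(); the arguments 10 and 12 are the ones used in the
question, and the variable names are made up for the example.

```r
# Original scalar version from the question
gcd <- function(a, b){
  if (b == 0){ return(a) }
  return(Recall(b, a %% b))
}

type_int   <- typeof(gcd(10L, 12L))   # both arguments stored as integer
type_dbl   <- typeof(gcd(10, 12))     # both arguments stored as double
type_mixed <- typeof(gcd(10L, 12))    # one integer, one double

print(c(type_int, type_dbl, type_mixed))   # "integer" "double" "double"
```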

Duncan Murdoch



