[R] bug?

Wed Jul 26 20:11:38 CEST 2006

On Wed, 26 Jul 2006, Patrick Jahn wrote:

> Dear All,
> if you generate a sequence with small latitude like:
>
> x<-seq(0,1,0.005)
>
> and you ask for all points of this lattice how many points are in a neighbourhood with radius 0.01
> of each point:
>
> v <- rep( 0 , length( x ) ) ;
> for (i in 1:length(x) )
>     {  v[i] <- length(x[ abs(x-x[i]) < 0.01 ] ) ;   };
>
> then the answer should be:  v = (2, 3, 3, 3, 3,.......,3, 3, 3, 3, 2), because every point instead
> of the borders has 3 points in a 0.01-neighbourhood.
>
> but v contains also many 4 and also 5:
>
>> v
>  [1] 2 4 3 4 4 3 4 4 3 4 4 3 4 4 4 4 5 4 4 5 4 4 5 4 4 4 3 4 4 4 4 3 3 3 4 4 4
> [38] 4 3 3 4 4 4 4 3 3 4 4 4 4 3 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 3 4 4 4 4 3
> [75] 3 3 3 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3
> [112] 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 3 3 3 3 3
> [149] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
> [186] 3 3 3 3 3 4 4 4 4 3 3 3 3 3 3 2
>
> Could anyone explain this fact and help me to compute exactly on general 
> data.
>

Yes and no.

The fact is easily explained: 0.005 and 0.01 are not exactly representable 
in floating point, and so it will not be true for all x that x+0.005+0.005 
= x+0.01. This is a FAQ.

For this problem an easy solution is to multiply by 200 (or 1000) and work 
with integers, which can be exactly represented.  There is no solution for 
general data, although software for arbitrary precision floating point may 
come close (there was a message yesterday from someone trying to interface 
pari/gp, which does this, with R).

 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle