[Rd] Floating point maths in R
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Sat Dec 9 14:48:06 CET 2006
Tom McCallum wrote:
> Hi,
>
> I am not sure if this is just me using R (R-2.3.1 and R-2.4.0) in the
> wrong way or if there is a more serious bug. I was having problems
> getting some calculations to add up so I ran the following tests:
>
>
Please read FAQ 7.31 and the reference therein.
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
(Short answer: you cannot represent thirds exactly in decimal, nor
tenths in binary.)
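
As a minimal sketch of the tolerance-based comparison the FAQ points to, the snippet below wraps all.equal() in isTRUE(). The helper name nearly_equal is an assumption made for illustration and is not part of base R; the two expressions are taken from the session quoted below.

```r
# Hypothetical helper (not part of base R): TRUE when x and y agree to
# within all.equal()'s default numeric tolerance (about 1.5e-8).
nearly_equal <- function(x, y) isTRUE(all.equal(x, y))

exact_23  <- (2.23 - 2.00) == 0.23            # exact comparison
near_23   <- nearly_equal(2.23 - 2.00, 0.23)  # comparison with tolerance
exact_423 <- (4.23 - 2.23) == 2.00
near_423  <- nearly_equal(4.23 - 2.23, 2.00)

print(c(exact_23 = exact_23, near_23 = near_23,
        exact_423 = exact_423, near_423 = near_423))
```

The isTRUE() wrapper matters because all.equal() returns a character description of the difference, rather than FALSE, when the values differ.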
>> (2.34567 - 2.00000) == 0.34567 <------- should be true
>>
> [1] FALSE
>
>> (2.23-2.00) == 0.23 <------- should be true
>>
> [1] FALSE
>
>> 4-2==2
>>
> [1] TRUE
>
>> (4-2)==2
>>
> [1] TRUE
>
>> (4.0-2)==2
>>
> [1] TRUE
>
>> (4.0-2.0)==2
>>
> [1] TRUE
>
>> (4.0-2.0)==2.0
>>
> [1] TRUE
>
>> (4.2-2.2)==2.0
>>
> [1] TRUE
>
>> (4.20-2.20)==2.00
>>
> [1] TRUE
>
>> (4.23-2.23)==2.00 <------- should be true
>>
> [1] FALSE
>
>> (4.230-2.230)==2.000 <------- should be true
>>
> [1] FALSE
>
>> (4.230-2.230)==2.00 <------- should be true
>>
> [1] FALSE
>
>> (4.230-2.23)==2.00 <------- should be true
>>
> [1] FALSE
>
> I have tried these on both 64 and 32-bit machines. Surely R should be
> able to do maths to 2 decimal places and be able to test these simple
> expressions? The problem occurs as in the 16th decimal place junk is
> being placed by the FPU it seems. I have also tried:
>
>
>> (4.2300000000000000-2.230000000000000) == 2
>>
> [1] FALSE
>
>> a <- (4.2300000000000000-2.230000000000000)
>> a == 2
>>
> [1] FALSE
>
>> (4.2300000000000000-2.230000000000000) == 2.0000000000000000
>>
> [1] FALSE
>
>> (4.2300000000000000-2.230000000000000) == 2.0000000000000004 <-- correct
>> when add 16th decimal place to 4
>>
> [1] TRUE
>
>> (4.2300000000000000-2.230000000000000) == 2.00000000000000043 <-- any
>> values after the 16th decimal place mean that the expression is true
>>
> [1] TRUE
>
>> (4.2300000000000000-2.230000000000000) == 2.000000000000000435
>>
> [1] TRUE
>
> Also :
>
>
>> (4.2300000000000000-2.230000000000000) == 2.0000000000000001
>>
> [1] FALSE
>
>> (4.2300000000000000-2.230000000000000) == 2.0000000000000003
>>
> [1] TRUE
>
>> (4.2300000000000000-2.230000000000000) == 2.0000000000000004
>>
> [1] TRUE
>
>> (4.2300000000000000-2.230000000000000) == 2.0000000000000005
>>
> [1] TRUE
>
>> (4.2300000000000000-2.230000000000000) == 2.0000000000000006 <-- 3,5 I
>> can understand being true if rounding occurring, but 6?
>>
> [1] TRUE
>
>> (4.2300000000000000-2.230000000000000) == 2.0000000000000007
>>
> [1] FALSE
>
>> (4.2300000000000000-2.230000000000000) == 2.0000000000000008
>>
> [1] FALSE
>
>> (4.2300000000000000-2.230000000000000) == 2.0000000000000009
>>
> [1] FALSE
>
>> (4.2300000000000000-2.230000000000000) == 2.0000000000000010
>>
>
>
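The pattern above (…3 to …6 compare TRUE, …1 and …7 onwards compare FALSE) is what one would expect if each decimal literal is rounded to the nearest double. Adjacent doubles between 2 and 4 are 2^-51 (about 4.44e-16) apart, so any literal between roughly 2.00000000000000022 and 2.00000000000000067 should end up as the same double, 2 + 2^-51. The sketch below checks that the difference is exactly that double; the variable names are illustrative only.

```r
a   <- 4.23 - 2.23
ulp <- 2^-51                  # gap between adjacent doubles in [2, 4)

# Is `a` exactly the first representable double above 2?
is_next_above_two <- (a == 2 + ulp)

# 2 and the two doubles just above it, printed to 17 significant digits.
neighbours <- sprintf("%.17g", 2 + (0:2) * ulp)

print(is_next_above_two)
print(neighbours)
```

Whether a given 17-digit literal lands on the expected neighbour depends on how the parser rounds it, so the sketch deliberately uses only powers of two rather than long decimal literals.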
> This is an example of junk being added in the FPU
>
>> formatC(a, digits=20)
>>
> [1] "2.0000000000000004441"
>
> I don't know if this is just a formatC error when using more than 16
> decimal places or if this junk is what is stopping the equality from being
> true:
>
>
>> formatC(a, digits=16)
>>
> [1] " 2"
>
>> formatC(a, digits=17) <-- 16 decimal places, 17 significant figures
>> shown
>>
> [1] "2.0000000000000004" <-- the problem is the 4 at the end
>
> Obviously the bytes are divided between the exponent and mantissa in
> 16-16bit share it seems, but this doesn't account for the 16th decimal
> place behaviour does it?
>
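On the bit layout: an IEEE 754 double is not split 16/16. It has 1 sign bit, 11 exponent bits and 52 stored fraction bits, which gives 53 significant bits, or a little under 16 decimal digits. That is where the 16th-decimal-place behaviour comes from. A small sketch that reads these figures from .Machine, assuming an IEEE 754 platform:

```r
sig_bits   <- .Machine$double.digits   # significant bits in a double
eps        <- .Machine$double.eps      # gap between 1 and the next double
dec_digits <- sig_bits * log10(2)      # equivalent number of decimal digits

print(c(sig_bits = sig_bits, dec_digits = dec_digits))
print(eps == 2^-52)
```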
> If any one has a work around or reason why this should occur it would be
> useful to know.
>
> What I would like is to be able to do sums such as (2.3456 - 2 ) == 0.3456
> and get a sensible answer - any suggestions? Currently the only way is
> to formatC the expression to a known number of decimal places - is there
> a better way?
>
> Many thanks
>
> Tom
>
>
>
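If the number of decimal places is known in advance, one possible alternative to formatC is to scale both sides to whole numbers and compare those. The helper below is a hypothetical sketch, not an existing R function, and it assumes neither scaled value sits close to a .5 boundary, where the two sides could round in different directions.

```r
# Hypothetical helper: compare x and y after scaling to `digits` decimals.
same_to_digits <- function(x, y, digits) {
  scale <- 10^digits
  round(x * scale) == round(y * scale)
}

r1 <- same_to_digits(2.3456 - 2, 0.3456, 4)   # same to 4 decimals
r2 <- same_to_digits(4.23 - 2.23, 2, 2)       # same to 2 decimals
r3 <- same_to_digits(2.3456 - 2, 0.3457, 4)   # differs in the 4th decimal

print(c(r1, r2, r3))
```

For money-like data it is often simpler still to store the scaled whole numbers (e.g. cents) from the start and only divide for display.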