[Rd] Floating point maths in R
Tom McCallum
tom.mccallum at levelelimited.com
Sat Dec 9 14:29:01 CET 2006
Hi,
I am not sure if this is just me using R (R-2.3.1 and R-2.4.0) in the
wrong way or if there is a more serious bug. I was having problems
getting some calculations to add up so I ran the following tests:
> (2.34567 - 2.00000) == 0.34567 <------- should be true
[1] FALSE
> (2.23-2.00) == 0.23 <------- should be true
[1] FALSE
> 4-2==2
[1] TRUE
> (4-2)==2
[1] TRUE
> (4.0-2)==2
[1] TRUE
> (4.0-2.0)==2
[1] TRUE
> (4.0-2.0)==2.0
[1] TRUE
> (4.2-2.2)==2.0
[1] TRUE
> (4.20-2.20)==2.00
[1] TRUE
> (4.23-2.23)==2.00 <------- should be true
[1] FALSE
> (4.230-2.230)==2.000 <------- should be true
[1] FALSE
> (4.230-2.230)==2.00 <------- should be true
[1] FALSE
> (4.230-2.23)==2.00 <------- should be true
[1] FALSE
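One way to look at what is being compared in the session above is to print the operands with more digits than R shows by default. This is only a minimal sketch; the variable names `operands` and `diff_223` are illustrative and not part of the original session.

```r
# Decimal literals such as 4.23 are stored as the nearest binary double,
# so printing more digits than the default exposes the rounding.
operands <- c(4.23, 2.23, 2.00)
print(sprintf("%.20f", operands))

# The subtraction is carried out on the stored values, not the decimal ones.
diff_223 <- 4.23 - 2.23
print(sprintf("%.20f", diff_223))
print(diff_223 == 2)    # FALSE, as in the session above
print(4.2 - 2.2 == 2)   # TRUE, as in the session above
```

The difference from 2 is tiny (below 1e-15), but `==` tests for exact equality of the stored doubles.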
I have tried these on both 64-bit and 32-bit machines. Surely R should be
able to do maths to 2 decimal places and test these simple
expressions? The problem seems to be that junk is being placed in the
16th decimal place by the FPU. I have also tried:
> (4.2300000000000000-2.230000000000000) == 2
[1] FALSE
> a <- (4.2300000000000000-2.230000000000000)
> a == 2
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000000
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000004 <-- true once a 4 is added in the 16th decimal place
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.00000000000000043 <-- any values after the 16th decimal place mean that the expression is true
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.000000000000000435
[1] TRUE
Also :
> (4.2300000000000000-2.230000000000000) == 2.0000000000000001
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000003
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000004
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000005
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000006 <-- I can understand 3 and 5 being true if rounding is occurring, but 6?
[1] TRUE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000007
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000008
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000009
[1] FALSE
> (4.2300000000000000-2.230000000000000) == 2.0000000000000010
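The pattern in these comparisons can be checked against the spacing between neighbouring doubles near 2. The following is a sketch under the assumption that R uses IEEE 754 double precision (true on essentially all current platforms); `gap` is an illustrative name.

```r
# Doubles have a 53-bit significand, so in [2, 4) neighbouring values
# are 2^-51 (about 4.44e-16) apart.
gap <- 2 * .Machine$double.eps   # .Machine$double.eps is 2^-52
a <- 4.23 - 2.23

# Is a exactly the first representable double above 2?
print(a == 2 + gap)

# Literals that differ only below that resolution can parse to the same double.
print(2.0000000000000003 == 2.0000000000000005)
print(2.0000000000000001 == 2)
```

If these all print TRUE, then any literal that rounds to `2 + gap` compares equal to `a`, and any literal that rounds to a different double does not.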
This is an example of junk being added by the FPU:
> formatC(a, digits=20)
[1] "2.0000000000000004441"
I don't know if this is just a formatC error when using more than 16
decimal places or if this junk is what is stopping the equality from being
true:
> formatC(a, digits=16)
[1] " 2"
> formatC(a, digits=17) <-- 16 decimal places, 17 significant figures shown
[1] "2.0000000000000004" <-- the problem is the 4 at the end
Presumably the bits are split between the exponent and the mantissa (a
16/16-bit share, it seems), but this doesn't account for the behaviour
at the 16th decimal place, does it?
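The actual split can be queried from R itself through `.Machine`. A small sketch, assuming the usual IEEE 754 double format; `sig_bits`, `exp_bits` and `dec_digits` are illustrative names.

```r
# Query the double format R was built with.
sig_bits <- .Machine$double.digits     # significand bits, incl. the implicit leading bit
exp_bits <- .Machine$double.exponent   # exponent bits
print(c(significand = sig_bits, exponent = exp_bits))

# Binary significand digits converted to an equivalent number of decimal digits.
dec_digits <- sig_bits * log10(2)
print(dec_digits)

a <- 4.23 - 2.23
print(sprintf("%.17g", a))             # 17 significant digits identify a double uniquely
```

With a 53-bit significand this comes to a little under 16 decimal digits, which would explain why the noise shows up at the 16th or 17th significant figure.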
If anyone has a workaround, or knows why this occurs, it would be
useful to know.
What I would like is to be able to do sums such as (2.3456 - 2) == 0.3456
and get a sensible answer. Any suggestions? Currently the only way is
to formatC the expression to a known number of decimal places. Is there
a better way?
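Some possible alternatives to formatting the numbers as strings are sketched below. `all.equal`, `isTRUE` and `round` are base R; the helper `nearly_equal` and its default `tol` are hypothetical and only for illustration, and the right tolerance depends on the calculation.

```r
# Compare with a tolerance instead of testing exact equality.
nearly_equal <- function(x, y, tol = 1e-9) {
  # illustrative helper, not part of base R
  abs(x - y) < tol
}

r1 <- nearly_equal(2.3456 - 2, 0.3456)

# Base R: all.equal() uses a relative tolerance of about 1.5e-8 by default.
r2 <- isTRUE(all.equal(2.3456 - 2, 0.3456))

# Round both sides to the known number of decimal places before comparing.
r3 <- round(2.3456 - 2, 4) == round(0.3456, 4)

print(c(r1, r2, r3))
```

`all.equal` returns a character description rather than FALSE when the values differ, which is why it is wrapped in `isTRUE` here.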
Many thanks
Tom
--
Dr. Thomas McCallum
Systems Architect,
Level E Limited
ETTC, The King's Buildings
Mayfield Road,
Edinburgh EH9 3JL, UK
Work +44 (0) 131 472 4813
Fax: +44 (0) 131 472 4719
http://www.levelelimited.com
Email: tom at levelelimited.com