[R] strangely long floating point with write.table()

Sat Mar 15 22:32:21 CET 2014

On Sat, 15 Mar 2014, Rui Barradas wrote:

> I haven't followed this thread since its start but I think you now have 
> a case for FAQ 7.31. See inline below.
>
> Try
>
>> (1-0.995) - 0.005
> [1] 4.336809e-18
>> (2-1.995) - 0.005
> [1] -1.066855e-16
>
> Hope this helps,

Yes, that does show the problem well, but it isn't an issue of equality of 
numbers, it's an issue of what to print.  The problem is that a number 
like .005 may round down to three digits if we only look at 15 digits, but 
it might not round down if we look at 17 digits.  When the machine 
precision is 2^-52 = 2.22e-16, why are we looking at 17 digits?
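
A minimal sketch of the 15-versus-17 digit question, assuming only base R. It renders 0.005 and 1-0.995 with sprintf() at both widths. Seventeen significant digits is the width generally needed to tell any two distinct doubles apart, which is presumably why output at that width exposes the last-bit noise; the choice of sprintf() here is for illustration only and is not necessarily what write.table() does internally.

```r
# Machine epsilon for doubles, as quoted above
eps <- .Machine$double.eps
print(eps == 2^-52)

# Two doubles that differ only in their last bits
x <- c(0.005, 1 - 0.995)

# 15 significant digits: both values collapse to the short form
s15 <- sprintf("%.15g", x)

# 17 significant digits: the difference between them becomes visible
s17 <- sprintf("%.17g", x)

print(s15)
print(s17)
```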

We don't want to print 14 digits of nonsense, right?

At least with write() we have some control, but we don't have that control 
with write.table().

But even with write(), I find the behavior a little unusual:

   > options(digits=15)

   > write(1-0.995, file="/dev/stdout")
   0.0050000000000000044
     1234567890123456789 <- I added this

   > write(2-1.995, file="/dev/stdout")
   0.0049999999999998934
     1234567890123456789 <- I added this

If we round at the 15th digit we should have 0.005 followed by nothing but 
zeros, so why doesn't that happen?  Why does write() display 19 digits 
when options(digits=15) is set?
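
One thing worth checking here, sketched below under the assumption that the ruler above counts decimal places: options(digits) counts significant digits, and the leading zeros after the decimal point are not significant. Rounding at the 15th decimal place and rounding to 15 significant digits can therefore give different answers for the same value. This does not explain why write() printed more than 15 significant digits in the session above.

```r
x <- c(1 - 0.995, 2 - 1.995)

# Rounded at the 15th decimal place (what the ruler above counts)
dec15 <- sprintf("%.15f", x)

# Rounded to 15 significant digits (what options(digits = 15) refers to)
sig15 <- sprintf("%.15g", x)

print(dec15)
print(sig15)
```

With 15 decimal places both values come out as 0.005 followed by zeros, but with 15 significant digits the second value keeps a visible tail, because its error relative to 0.005 is larger than one part in 10^15.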

A lot of short decimal numbers don't have exact binary expansions, so this 
is going to happen all the time.  I think we should have an easy way to 
preserve the short numbers as they are without having to present them as 
17 digits long.  This also happens erratically, for the reason you show 
above -- write.table() uses a level of precision that teeters on the edge 
of machine precision, sometimes falling in one direction, sometimes in the 
other.  The result is annoyingly haphazard.

The cause of the problem is like that of the equality problem -- internal 
binary representations -- but the equality problem can't really be solved 
(well, one could use abs(diff)<num, if that worked).  This problem is not 
hard to solve.  In fact, the write() command with options(digits) is 
supposed to solve this.  I guess it does solve it, within the limited uses 
to which write() is usually applied, but it doesn't do what I would expect 
in terms of number of digits.  Don't we want similar functionality for 
write.table() and related functions?  Or maybe just reduce the precision 
of write.table() by some small amount so that machine precision isn't 
constantly causing annoying output?
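
Until something like that exists, one possible workaround is to round the numeric columns before handing the data frame to write.table(). The sketch below assumes that 10 decimal places is enough for the data at hand; round_numeric() is a hypothetical helper written for this illustration, not part of R.

```r
df <- data.frame(id = c("a", "b"),
                 value = c(1 - 0.995, 2 - 1.995),
                 stringsAsFactors = FALSE)

# Hypothetical helper: round every numeric column to a fixed number of
# decimal places (10 is an arbitrary choice for this example)
round_numeric <- function(d, places = 10) {
  num <- vapply(d, is.numeric, logical(1))
  d[num] <- lapply(d[num], round, digits = places)
  d
}

# write.table() prints to the console by default; capture the lines
# so they can be inspected
out <- capture.output(
  write.table(round_numeric(df), quote = FALSE, sep = "\t",
              row.names = FALSE)
)
print(out)
```

Rounding changes the stored values, so this only makes sense when the discarded digits really are noise. Converting the columns to character with format() or sprintf() before writing is an alternative that leaves the numbers themselves untouched.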

Thanks.

Mike

-- 
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota
http://scholar.google.com/citations?user=EV_phq4AAAAJ