[R] strangely long floating point with write.table()

peter dalgaard pdalgd at gmail.com
Sat Mar 15 08:15:55 CET 2014


On 15 Mar 2014, at 04:03 , Mike Miller <mbmiller+l at gmail.com> wrote:

> On Fri, 14 Mar 2014, Duncan Murdoch wrote:
> 
>> On 14-03-14 8:59 PM, Mike Miller wrote:
>>> What I'm using:
>>> R version 3.0.1 (2013-05-16) -- "Good Sport"
>>> Copyright (C) 2013 The R Foundation for Statistical Computing
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>> 
>> That's not current, but it's not very old...
>> 
>>> According to some docs, options(digits) controls numerical precision in output of write.table().  I'm using the default value for digits:
>>>> getOption("digits")
>>> [1] 7
>>> I have a bunch of numbers in a data frame that are only a few digits to the right of the decimal:

I don't think so. I think some of your numbers differ sufficiently from numbers with only a few digits to the right of the decimal that write.table needs to write them with increased precision. You didn't read them in like that, did you? You did some calculations, and then it _looked like_ the results had <= 6 digits after the decimal point?

Watch this:

> dd <- read.table(text="
+ a b c
+ 0.998 0.998 0.996004
+ 0.956 0.956 0.913936
+ 0.99 1.99 3.9601
+ 0.00800000000000001 0.00800000000000001 6.40000000000001e-05
+ 0.99 0.99 0.9801
+ 0.0229999999999999 0.0229999999999999 0.000528999999999996
+ ",header=TRUE)

> dd
      a     b        c
1 0.998 0.998 0.996004
2 0.956 0.956 0.913936
3 0.990 1.990 3.960100
4 0.008 0.008 0.000064
5 0.990 0.990 0.980100
6 0.023 0.023 0.000529

> write.table(dd, sep="\t", row.names=FALSE, col.names=FALSE)
0.998	0.998	0.996004
0.956	0.956	0.913936
0.99	1.99	3.9601
0.00800000000000001	0.00800000000000001	6.40000000000001e-05
0.99	0.99	0.9801
0.0229999999999999	0.0229999999999999	0.000528999999999996

> round(dd,7)
      a     b        c
1 0.998 0.998 0.996004
2 0.956 0.956 0.913936
3 0.990 1.990 3.960100
4 0.008 0.008 0.000064
5 0.990 0.990 0.980100
6 0.023 0.023 0.000529

> write.table(round(dd,7), sep="\t", row.names=FALSE, col.names=FALSE)
0.998	0.998	0.996004
0.956	0.956	0.913936
0.99	1.99	3.9601
0.008	0.008	6.4e-05
0.99	0.99	0.9801
0.023	0.023	0.000529

Notice that the _relative_ error in those numbers has snuck up into the 1e-15 range:

> (dd - round(dd,7))/dd
              a             b             c
1  0.000000e+00  0.000000e+00  0.000000e+00
2  0.000000e+00  0.000000e+00  0.000000e+00
3  0.000000e+00  0.000000e+00  0.000000e+00
4  1.301043e-15  1.301043e-15  1.694066e-15
5  0.000000e+00  0.000000e+00  0.000000e+00
6 -4.374520e-15 -4.374520e-15 -7.378313e-15
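
As a hedged illustration of how such values can arise: the subtraction below is an assumption, since the original post does not say which calculations produced the data. It shows that a result which prints as 0.008 need not be the double closest to 0.008.

```r
# Assumed example: 0.992 has no exact binary representation, so the
# difference below is close to, but not identical to, the double for 0.008.
x <- 1 - 0.992

short <- format(x)               # default of 7 significant digits, as print() uses
long  <- format(x, digits = 15)  # 15 significant digits, as write.table() uses

same_double <- (x == 0.008)                 # exact comparison
near_enough <- isTRUE(all.equal(x, 0.008))  # comparison within a tolerance

print(c(short = short, long = long))
print(c(same_double = same_double, near_enough = near_enough))
```

The short form hides a difference that the 15-digit form reveals, which is the same pattern as in rows 4 and 6 above.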

The digits= setting has nothing to do with this; write.table always does its damnedest to avoid loss of precision. This _is_ in help(write.table):

     In almost all cases the conversion of numeric quantities is
     governed by the option ‘"scipen"’ (see ‘options’), but with the
     internal equivalent of ‘digits = 15’.  For finer control, use
     ‘format’ to make a character matrix/data frame, and call
     ‘write.table’ on that.
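
A minimal sketch of the approach the help text describes, under stated assumptions: the data frame is a made-up subset of the values above, and 7 significant digits is an arbitrary choice, not something write.table itself prescribes.

```r
# Illustrative data only: two columns containing values that are not
# exactly the short decimals they appear to be when printed.
dd <- data.frame(
  a = c(0.998, 0.00800000000000001, 0.0229999999999999),
  c = c(0.996004, 6.40000000000001e-05, 0.000528999999999996)
)

# Convert every numeric column to character, keeping at most 7 significant
# digits; trim = TRUE suppresses padding with leading blanks.
dd_chr <- as.data.frame(
  lapply(dd, function(col) format(col, digits = 7, trim = TRUE)),
  stringsAsFactors = FALSE
)

# quote = FALSE stops write.table from wrapping the strings in double quotes.
out <- capture.output(
  write.table(dd_chr, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
)
print(out)
```

Note that format() gives all entries in a column the same number of decimals, so the written text may differ in layout from what write.table produces for rounded numeric data.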



>> 
>> That's not enough to reproduce this.  Put together a self-contained reproducible example if you're wondering why something behaves as it does. With just a bunch of output, you'll just get uninformed guesses.
> 
> 
> Thanks for the tip.  Here's what I've done:
> 
>> data2 <- data[c(94,120),c(18,20,21)]
>> save(data2, file="data2.Rdata")
>> q("no")
> 
> $ R
>> load("data2.Rdata")
>> data2
>      V18   V20      V21
> 94  0.008 0.008 0.000064
> 120 0.023 0.023 0.000529
>> write.table(data2, file="data2.txt", sep="\t", row.names=F, col.names=F)
> 
> $ cat data2.txt
> 0.00800000000000001     0.00800000000000001     6.40000000000001e-05
> 0.0229999999999999      0.0229999999999999      0.000528999999999996
> 
> The data2.Rdata file is attached to this message.
> 
> I guess that is enough to reproduce this exact finding.  I don't know how it works in general.
> 
> I don't have a newer version of R available right now.  It did the same thing on an older version (2.15.1).
> 
> Interestingly, on a different machine with an even older version (2.12.2) I see something a little different:
> 
> 0.008   0.008   6.40000000000001e-05
> 0.0229999999999999      0.0229999999999999      0.000528999999999996
> 
> Best,
> Mike

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com



