[R] strangely long floating point with write.table()
Duncan Murdoch
murdoch.duncan at gmail.com
Sat Mar 15 02:26:36 CET 2014
On 14-03-14 8:59 PM, Mike Miller wrote:
> What I'm using:
>
> R version 3.0.1 (2013-05-16) -- "Good Sport"
> Copyright (C) 2013 The R Foundation for Statistical Computing
> Platform: x86_64-unknown-linux-gnu (64-bit)
That's not current, but it's not very old...
>
>
> According to some docs, options(digits) controls numerical precision in
> output of write.table(). I'm using the default value for digits:
>
>> getOption("digits")
> [1] 7
>
> I have a bunch of numbers in a data frame that are only a few digits to
> the right of the decimal:
That's not enough to reproduce this. If you want to know why something
behaves as it does, put together a self-contained reproducible example.
With only a dump of output, you'll get nothing but uninformed guesses.
Duncan Murdoch
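
A minimal sketch of what such a self-contained example might look like.
It is not the poster's data: it assumes (hypothetically) that the
affected values came out of floating-point arithmetic such as 1.008 - 1
rather than being typed in as 0.008. The names typed, computed, df and
out are illustrative only. The Details section of ?write.table states
that numeric conversion uses the internal equivalent of digits = 15
rather than getOption("digits"), which would be consistent with the
15 significant digits in the file.

```r
# Hypothetical data: the same nominal value, once typed and once computed.
typed    <- 0.008
computed <- 1.008 - 1   # assumption: the real data came from arithmetic like this

# At the default options(digits = 7), both print as 0.008.
print(c(typed, computed))

# They are nevertheless different doubles.
print(typed == computed)                   # FALSE
print(isTRUE(all.equal(typed, computed)))  # TRUE (equal within tolerance)

# write.table() does not use getOption("digits"), so the difference shows.
df  <- data.frame(typed = typed, computed = computed)
out <- capture.output(
  write.table(df, sep = "\t", row.names = FALSE, col.names = FALSE)
)
print(out)
```

If the real data behave the same way, comparing the column against its
rounded values, e.g. which(data$V18 != round(data$V18, 3)), should pick
out the affected rows.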
>
>
>> data[c(94,120), c(18,20,21)]
> V18 V20 V21
> 94 0.008 0.008 0.000064
> 120 0.023 0.023 0.000529
>
>
> I write the data to a file:
>
>> write.table(data, file="data.txt", sep="\t", row.names=F, col.names=F)
>
>
> Then I look at those same values and this is what I see:
>
> $ gawk -F'\t' 'NR==94 || NR==120 {print $18,$20,$21}' data.txt
> 0.00800000000000001 0.00800000000000001 6.40000000000001e-05
> 0.0229999999999999 0.0229999999999999 0.000528999999999996
>
>
> This is the weird thing: Only those two records get the long, annoyingly
> "precise" 17-digit numbers. Other records look like this:
>
> 0.052 1.052 1.106704
> 0.178 0.178 0.031684
>
> I understand that binary representations won't reflect decimal
> representations precisely, etc., but why do I get this junk for only two
> records out of 197 records? Records that contain only integral values
> can't get it wrong, but the other 30 of 32 records with decimals look fine
> -- see below.
>
> Also, if precision should be to 7 digits, why am I getting 17 digits for
> exactly two of the records? Why does this happen for all three numbers in
> those two records?
>
> If you think this is a bug that I should report elsewhere, let me know.
>
> Thanks.
>
> Mike
>
>
>
> $ gawk -F'\t' '{print $18,$20,$21}' data.txt | grep -F .
> 0.944 0.944 0.891136
> 0.885 1.885 3.553225
> 0.052 1.052 1.106704
> 0.178 0.178 0.031684
> 1.996 1.996 3.984016
> 0.86 1.86 3.4596
> 0.765 1.765 3.115225
> 0.986 1.986 3.944196
> 0.998 0.998 0.996004
> 0.998 0.998 0.996004
> 0.956 0.956 0.913936
> 0.99 1.99 3.9601
> 0.00800000000000001 0.00800000000000001 6.40000000000001e-05
> 0.99 0.99 0.9801
> 0.0229999999999999 0.0229999999999999 0.000528999999999996
> 0.938 0.938 0.879844
> 0.034 1.034 1.069156
> 0.86 1.86 3.4596
> 0.911 1.911 3.651921
> 0.971 0.971 0.942841
> 0.994 0.994 0.988036
> 0.418 0.418 0.174724
> 0.805 1.805 3.258025
> 0.996 1.996 3.984016
> 0.998 1.998 3.992004
> 0.623 1.623 2.634129
> 0.998 0.998 0.996004
> 1.628 1.628 2.650384
> 0.981 0.981 0.962361
> 0.998 0.998 0.996004
> 1.676 1.676 2.808976
> 0.986 1.986 3.944196
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
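
One possible way to get short numbers in the file, sketched under the
assumption that 7 significant digits are enough for this data set. The
data frame df and the names rounded, out_num and out_chr are
hypothetical. ?write.table suggests format() for finer control over
numeric output; rounding with signif() before writing is an alternative
that keeps the columns numeric.

```r
# Hypothetical data frame with a computed value and its square.
df <- data.frame(a = 1.008 - 1)
df$b <- df$a^2

# Option 1: round numeric columns to 7 significant digits before writing.
rounded   <- df
rounded[] <- lapply(df, function(col) if (is.numeric(col)) signif(col, 7) else col)
out_num <- capture.output(
  write.table(rounded, sep = "\t", row.names = FALSE, col.names = FALSE)
)

# Option 2: convert to character with format(), then write without quotes.
out_chr <- capture.output(
  write.table(format(df, digits = 7), sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
)

print(out_num)
print(out_chr)
```

Note that format() pads the entries of a column to a common width, so
with several rows Option 2 may write extra spaces that Option 1 does not.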