[Rd] For numeric x, as.character(x) doesn't always match signif(x, 15)
Suharto Anggono
suharto_anggono at yahoo.com
Wed Oct 2 09:49:24 CEST 2013
I saw something like this.
> x <- 5180000000000003
> print(x, digits=20)
[1] 5180000000000003
> as.character(x)
[1] "5.18e+15"
I thought it was because, when x is numeric, as.character(x) represents x rounded to 15 significant digits.
> print(signif(x, 15), digits=20)
[1] 5180000000000000.0000
> as.numeric(as.character(x)) == signif(x, 15)
[1] TRUE
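To try the same comparison on a few more values, something like the following sketch could be used. The extra test values (0.1 and 2/3) are my own choices, and the results may depend on the R version and platform, so I give no expected output here.

```r
# Check, element by element, whether converting to character and back
# gives the same value as rounding to 15 significant digits.
x <- c(5180000000000003, 1234567890123456, 0.1, 2/3)

chr <- as.character(x)
back <- as.numeric(chr)
matches <- back == signif(x, 15)

# sprintf("%.17g") shows each value with 17 significant digits for reference.
print(data.frame(value = sprintf("%.17g", x),
                 as_character = chr,
                 matches = matches,
                 stringsAsFactors = FALSE))
```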
The documentation for 'as.character' in R states this in the "Details" section.
'as.character' represents real and complex numbers to 15
significant digits (technically the compiler's setting of the ISO
C constant 'DBL_DIG', which will be 15 on machines supporting
IEC60559 arithmetic according to the C99 standard). This ensures
that all the digits in the result will be reliable (and not the
result of representation error), but does mean that conversion to
character and back to numeric may change the number. If you want
to convert numbers to character with the maximum possible
precision, use 'format'.
But then, I was surprised when I also saw this, where as.character(x) didn't match signif(x, 15).
> x <- 1234567890123456
> print(x, digits=20)
[1] 1234567890123456
> as.character(x)
[1] "1234567890123456"
> print(signif(x, 15), digits=20)
[1] 1234567890123460
> as.numeric(as.character(x)) == signif(x, 15)
[1] FALSE
Then, I found another example of this behavior in https://stat.ethz.ch/pipermail/r-devel/2009-May/053341.html.
It seems that, for a numeric vector, the result of 'as.character' equals format(., digits=15) applied to each element individually. Is that always the case?
> format(5180000000000003, digits=15)
[1] "5.18e+15"
> format(1234567890123456, digits=15)
[1] "1234567890123456"
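A minimal sketch for checking this guess on several values at once. 'compare_repr' is a hypothetical helper name of mine, not a base R function, and the extra test values are arbitrary. Whether the two columns agree may depend on the R version, so the sketch only reports the comparison.

```r
# Compare as.character(x) with format(., digits = 15) applied to each
# element individually. 'compare_repr' is a hypothetical helper.
compare_repr <- function(x) {
  chr <- as.character(x)
  # format() is called per element so that no common format is imposed.
  fmt <- vapply(x, function(v) format(v, digits = 15), character(1))
  data.frame(as_character = chr,
             format_15 = fmt,
             same = chr == fmt,
             stringsAsFactors = FALSE)
}

x <- c(5180000000000003, 1234567890123456, 0.1, 1/3, 1e5, 123456.7)
res <- compare_repr(x)
print(res)
```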
I assume that format(x, digits=15) behaves like print(x, digits=15).
> print(5180000000000003, digits=15)
[1] 5.18e+15
> print(1234567890123456, digits=15)
[1] 1234567890123456
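The assumption can be checked programmatically for these two values with a sketch like the one below. 'printed' is a hypothetical helper of mine that captures what print() writes and strips the "[1] " prefix. It only covers single values, not vectors printed with a common format.

```r
# Capture the text written by print() and remove the "[1] " index prefix,
# so that it can be compared with the string returned by format().
# 'printed' is a hypothetical helper, meant for length-one input only.
printed <- function(v, digits) {
  sub("^\\[1\\] ", "", capture.output(print(v, digits = digits)))
}

x <- c(5180000000000003, 1234567890123456)
agree <- vapply(x,
                function(v) printed(v, 15) == format(v, digits = 15),
                logical(1))
print(agree)
```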
The result of
print(1234567890123456, digits=15)
violates the part
"at least one entry will be encoded with that minimum number"
in the "Details" section of the documentation for 'print.default':
The same number of decimal places is used throughout a vector.
This means that 'digits' specifies the minimum number of
significant digits to be used, and that at least one entry will be
encoded with that minimum number.
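To make the discrepancy concrete, the digits in the output can be counted. 'count_digits' is a hypothetical helper of mine. It simply counts digit characters, so it is only meaningful for fixed-notation output without leading zeros or an exponent, as in this case.

```r
# Count the digit characters in a formatted number.
# 'count_digits' is a hypothetical helper; it would also count exponent
# digits, so it should only be applied to fixed-notation output.
count_digits <- function(s) nchar(gsub("[^0-9]", "", s))

out <- format(1234567890123456, digits = 15)
n_digits <- count_digits(out)
cat(out, "uses", n_digits, "digits, although digits = 15 was requested\n")
```

With the output shown above, the count is 16 rather than 15.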
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: i386-w64-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_3.0.2