[Rd] For numeric x, as.character(x) doesn't always match signif(x, 15)

Suharto Anggono suharto_anggono at yahoo.com
Wed Oct 2 09:49:24 CEST 2013


I observed the following behavior.

> x <- 5180000000000003
> print(x, digits=20)
[1] 5180000000000003
> as.character(x)
[1] "5.18e+15"

I thought it was because, when x is numeric, as.character(x) represents x rounded to 15 significant digits.

> print(signif(x, 15), digits=20)
[1] 5180000000000000.0000
> as.numeric(as.character(x)) == signif(x, 15)
[1] TRUE

The documentation for 'as.character' in R states this in the "Details" section.

     'as.character' represents real and complex numbers to 15
     significant digits (technically the compiler's setting of the ISO
     C constant 'DBL_DIG', which will be 15 on machines supporting
     IEC60559 arithmetic according to the C99 standard).  This ensures
     that all the digits in the result will be reliable (and not the
     result of representation error), but does mean that conversion to
     character and back to numeric may change the number.  If you want
     to convert numbers to character with the maximum possible
     precision, use 'format'.
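
A minimal sketch of the round trip described in this passage, assuming IEEE 754 doubles, for which 17 significant digits are enough to recover any value exactly. The choice of digits = 17 and the variable names are illustrative assumptions, not part of the quoted documentation.

```r
# Two values from this post; both are whole numbers below 2^53,
# so each is exactly representable as a double.
x <- c(5180000000000003, 1234567890123456)

# Conversion through as.character(), documented to use 15 significant digits.
via_as_character <- as.numeric(as.character(x))

# Conversion through format() with 17 significant digits requested.
via_format_17 <- as.numeric(format(x, digits = 17))

# TRUE where the round trip returned the original number.
data.frame(
  as_character = via_as_character == x,
  format_17    = via_format_17 == x
)
```

The first column may contain FALSE entries, since 15 significant digits cannot describe every double exactly.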

But then, I was surprised when I also saw this, where as.character(x) didn't match signif(x, 15).

> x <- 1234567890123456
> print(x, digits=20)
[1] 1234567890123456
> as.character(x)
[1] "1234567890123456"
> print(signif(x, 15), digits=20)
[1] 1234567890123460
> as.numeric(as.character(x)) == signif(x, 15)
[1] FALSE
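
The comparison used in the two examples can be wrapped in a small helper and applied to both values at once. The name matches_signif15 is hypothetical, and the output on other R versions or platforms may differ from the session shown here, so no expected output is given.

```r
# Hypothetical helper: TRUE where the as.character() round trip
# equals the value rounded to 15 significant digits.
matches_signif15 <- function(x) {
  as.numeric(as.character(x)) == signif(x, 15)
}

x <- c(5180000000000003, 1234567890123456)
result <- matches_signif15(x)
names(result) <- as.character(x)
result
```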

Then, I found another example of this behavior in https://stat.ethz.ch/pipermail/r-devel/2009-May/053341.html.

It seems that, for a numeric vector, the result of 'as.character' equals format(., digits=15) applied to each element individually. Is that always the case?

> format(5180000000000003, digits=15)
[1] "5.18e+15"
> format(1234567890123456, digits=15)
[1] "1234567890123456"
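
One way to probe this question is to compare the two conversions element by element over a handful of test values. This is only a sketch: the test values beyond the two from this post are arbitrary choices, and agreement on a few values does not show that the two conversions always agree.

```r
x <- c(5180000000000003, 1234567890123456, 0.1, 1/3, 1e5, 123456.7)

# format() chooses one common layout for a whole vector, so call it
# once per element to mirror the per-element comparison asked about.
format_15 <- vapply(x, format, character(1), digits = 15)

comparison <- data.frame(
  as_character = as.character(x),
  format_15    = format_15,
  same         = as.character(x) == format_15,
  stringsAsFactors = FALSE
)
comparison
```

Any row with same equal to FALSE would be a counterexample for the R version in use.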

I assume that format(x, digits=15) behaves like print(x, digits=15).

> print(5180000000000003, digits=15)
[1] 5.18e+15
> print(1234567890123456, digits=15)
[1] 1234567890123456

The result of
print(1234567890123456, digits=15)
shows 16 significant digits, which violates the part
"at least one entry will be encoded with that minimum number"
in the "Details" section of the documentation for 'print.default'.

     The same number of decimal places is used throughout a vector.
     This means that 'digits' specifies the minimum number of
     significant digits to be used, and that at least one entry will be
     encoded with that minimum number.
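
To check that statement against actual output, the significant digits in a formatted string can be counted. The helper below is a rough sketch with a hypothetical name; it counts every mantissa digit after any leading zeros, so trailing zeros are treated as significant.

```r
# Hypothetical helper: count the digits in the mantissa of a formatted
# number, ignoring sign, decimal point, exponent, and leading zeros.
count_sig_digits <- function(s) {
  mantissa <- sub("[eE].*$", "", s)        # drop an exponent such as e+15
  digits <- gsub("[^0-9]", "", mantissa)   # keep digit characters only
  nchar(sub("^0+", "", digits))            # leading zeros are not significant
}

# Applied to the output of format() with digits = 15 for the value in question.
count_sig_digits(format(1234567890123456, digits = 15))
```

A count above 15 for a single-element input would mean that no entry was encoded with the requested minimum.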

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_3.0.2


