[R-pkg-devel] ASCII code for Degree symbol °

Ivan Krylov kry|ov@r00t @end|ng |rom gm@||@com
Mon Jan 24 09:23:42 CET 2022


On Sun, 23 Jan 2022 21:09:09 -0500
<dbosak01 using gmail.com> wrote:

> vec1 <- gsub("[\xB0]", ".", vec)

A great degree of care is needed with this.

Encoding('\xB0') is "unknown", i.e. \xXX escape codes are taken to be
bytes in your native system encoding. On GNU/Linux and other systems
where the native encoding is UTF-8 (rather than Latin-1 or an ANSI code
page), '\xB0' is an invalid byte sequence, not a degree symbol:

'\xB0' == '°'
# [1] FALSE
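
To see why at the byte level, here is a minimal sketch (the variable
names are mine; validUTF8() needs R >= 3.3.0). It inspects the lone byte
and then shows that it only becomes a degree sign once it is explicitly
declared to be Latin-1 and converted:

```r
x <- "\xB0"              # one raw byte 0xB0, encoding left unmarked
charToRaw(x)             # shows the single byte b0
valid <- validUTF8(x)    # FALSE: 0xB0 on its own is not valid UTF-8

# Declare the byte to be Latin-1 and convert it to UTF-8
deg <- iconv(x, from = "latin1", to = "UTF-8")
Encoding(deg)            # "UTF-8"
charToRaw(deg)           # two bytes, c2 b0: the UTF-8 form of the degree sign
```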

On the other hand, the Unicode code point for ° is U+00B0, so the same
number works as a \u escape:

as.hexmode(utf8ToInt('°'))
# [1] "b0"
'\ub0'
# [1] "°"

The difference is that Encoding('\ub0') is "UTF-8" and is therefore
portable between systems with different native encodings.
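
Applied to the original gsub() call, a portable version might look like
the sketch below. The contents of vec are made up for illustration, and
fixed = TRUE is my choice, since a single literal character needs no
regular expression:

```r
# Hypothetical input; degree signs are written as \u escapes so that
# the source file itself stays pure ASCII
vec <- c("45\u00b0 N", "120\u00b0 W", "no symbol")

# The \u00b0 pattern is marked as UTF-8, so it means the same thing
# regardless of the native encoding
vec1 <- gsub("\u00b0", ".", vec, fixed = TRUE)
print(vec1)
```

Writing the escape instead of a literal ° also keeps non-ASCII
characters out of the package's R code.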

-- 
Best regards,
Ivan


