[Rd] deparse() and UTF-8 strings

Gábor Csárdi
Mon Feb 21 11:33:30 CET 2022


I am wondering if it would make sense to produce \u escaped strings in
deparse() for UTF-8 input. Currently we have (in R-devel):

x <- "G\u00e1bor"
Sys.setlocale("LC_ALL", "C")
#> [1] "C/C/C/C/C/en_US.UTF-8"

deparse(x)
#> [1] "\"G<U+00E1>bor\""

charToRaw(deparse(x))
#> [1] 22 47 3c 55 2b 30 30 45 31 3e 62 6f 72 22

Is there a reason why this is preferable to returning

"\"G\\u00e1bor\""

i.e.

charToRaw("\"G\\u00e1bor\"")
#>  [1] 22 47 5c 75 30 30 65 31 62 6f 72 22
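
To make the difference concrete, here is a small sketch that feeds both forms back through parse(). The variable names (current, proposed, from_current, from_proposed) are only illustrative, and the "current" string is copied from the output above rather than recomputed, so the result does not depend on the locale in which it is run.

```r
x <- "G\u00e1bor"

# The two deparsed forms discussed above, written out as literal strings.
current  <- "\"G<U+00E1>bor\""   # what deparse(x) returned in the C locale
proposed <- "\"G\\u00e1bor\""    # the \u escaped alternative

# Treat each form as R source code and evaluate it.
from_current  <- eval(parse(text = current,  keep.source = FALSE))
from_proposed <- eval(parse(text = proposed, keep.source = FALSE))

identical(from_current, x)
#> [1] FALSE
identical(from_proposed, x)
#> [1] TRUE
```

The <U+00E1> form parses as the eight literal ASCII characters between "G" and "bor", so the original string cannot be recovered from it.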

Returning the \u escaped form would make deparse() the inverse of
parse(), at least in this respect.
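
For illustration only, the behaviour I have in mind could be sketched at the R level roughly as follows. deparse_escaped() is a hypothetical helper, not an existing base R function, and it is not how deparse() is implemented internally; it assumes a single non-NA string and leaves the handling of ASCII characters to deparse() itself.

```r
# Hypothetical helper (not part of base R): deparse one string, writing
# every non-ASCII character as a \u or \U escape.
deparse_escaped <- function(x) {
  stopifnot(is.character(x), length(x) == 1L, !is.na(x))
  cps <- utf8ToInt(enc2utf8(x))            # Unicode code points of x
  pieces <- vapply(cps, function(cp) {
    if (cp < 128L) {
      # ASCII: let deparse() escape quotes, backslashes and control
      # characters, then drop the surrounding double quotes it adds.
      d <- deparse(intToUtf8(cp))
      substr(d, 2L, nchar(d) - 1L)
    } else if (cp <= 0xFFFFL) {
      sprintf("\\u%04x", cp)               # four hex digits
    } else {
      sprintf("\\U%08x", cp)               # eight hex digits
    }
  }, character(1))
  paste0("\"", paste(pieces, collapse = ""), "\"")
}

x <- "G\u00e1bor"
s <- deparse_escaped(x)
writeLines(s)
#> "G\u00e1bor"

# The escaped form parses back to the original string.
identical(eval(parse(text = s, keep.source = FALSE)), x)
#> [1] TRUE
```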

Thank you,
Gabor


