[R] replacing unicode characters
Ivan Krylov
kry|ov@r00t @end|ng |rom gm@||@com
Fri Jun 30 13:10:27 CEST 2023
On Fri, 30 Jun 2023 11:33:34 +0300
Adrian Dușa <dusa.adrian using unibuc.ro> wrote:
> In a very simple test, I tried creating a text file from the Electron
> app embedded R:
> sink("test.txt")
> cat("\u00e7")
> sink()
>
> which resulted in:
>
> <U+00E7>
>
> I don't quite understand how this works, my best guess is it matters
> less how R interprets these characters, but how they are passed
> through the child process that started R.
Something goes wrong with the locale setting when the R child process
is being launched. For example,
Rscript -e 'cat("\ue7\n")'
# ç
but:
LC_ALL=C Rscript -e 'cat("\ue7\n")'
# <U+00E7>
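The difference between the two runs can also be checked from inside R. Below is a minimal sketch, not a definitive test: is_representable() is a hypothetical helper name of mine, and it assumes that enc2native() substitutes the <U+XXXX> escape for characters the native encoding cannot hold.

```r
# Hypothetical helper (not part of R): TRUE where a string survives
# conversion to the session's native encoding unchanged.
is_representable <- function(x) {
  vapply(
    x,
    # identical() treats strings in different marked encodings as equal
    # when they agree after translation to UTF-8, so only a real
    # substitution (e.g. an escape replacing the character) gives FALSE.
    function(s) identical(enc2native(s), s),
    logical(1),
    USE.NAMES = FALSE
  )
}

# Plain ASCII should pass everywhere; the second element depends on
# the locale the process was started with.
print(is_representable(c("abc", "\u00e7")))
```

In a UTF-8 session I would expect both elements to be TRUE, while under LC_ALL=C the second should come back FALSE.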
When preparing \ue7 for output, R decides that it's not representable
in the session's native encoding (plain ASCII under the C locale) and
prints the <U+00E7> escape instead. What's the output of sessionInfo()
and l10n_info() in the child process?
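One way to collect this in a single run inside the child process is sketched below. locale_report() is a hypothetical helper name, and the three environment variables are only my guess at the ones most likely to matter here.

```r
# Hypothetical helper (not part of R): gather the locale facts in one list.
locale_report <- function() {
  list(
    locale = Sys.getlocale(),   # all locale categories as one string
    l10n   = l10n_info(),       # MBCS / UTF-8 / Latin-1 flags
    # Variables the parent process may or may not have passed on;
    # NA means the variable is unset in the child.
    env    = Sys.getenv(c("LANG", "LC_ALL", "LC_CTYPE"), unset = NA)
  )
}

info <- locale_report()
str(info)
print(sessionInfo())
```

If the child's standard output is not visible from the Electron side, the same calls can be wrapped in sink("test.txt") ... sink() as in the original test.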
--
Best regards,
Ivan