[Rd] R on Windows with UCRT and the system encoding
Hiroaki Yutani
yut@n|@|n| @end|ng |rom gm@||@com
Tue Dec 21 06:34:32 CET 2021
Hi,
I'm more than excited about the announcement about the upcoming UTF-8
R on Windows. Let me confirm my understanding. Is R 4.2 supposed to
work on Windows with non-UTF-8 encoding as the system locale? I think
this blog post indicates so (as this describes the older Windows than
the UTF-8 era), but I'm not fully confident if I understand the
details correctly.
https://developer.r-project.org/Blog/public/2021/12/07/upcoming-changes-in-r-4.2-on-windows/index.html
If so, I'm curious what the package authors should do when the locales
are different between OS and R. For example (disclaimer: I don't
intend to blame processx at all. Just for an example), the CRAN check
on the processx package currently fails with this warning on R-devel
Windows.
> 1. UTF-8 in stdout (test-utf8.R:85:3) - Invalid multi-byte character at end of stream ignored
https://cran.r-project.org/web/checks/check_results_processx.html
As far as I know, processx launches an external process and captures
its output, and I suspect the problem is that the output of the
process is encoded in non-UTF-8 while R assumes it's UTF-8. I
experienced similar problems with other packages as well, which
disappear if I switch the locale to the same one as the OS by
Sys.setlocale(). So, I think it would be great if there's some
guidance for the package authors on how to handle these properly.
Any suggestions?
Best,
Yutani
More information about the R-devel
mailing list