[Rd] R on Windows with UCRT and the system encoding

Tue Dec 21 06:34:32 CET 2021

Hi,

I'm very excited about the announcement of the upcoming UTF-8 support
in R on Windows. Let me confirm my understanding: is R 4.2 supposed to
work on Windows when the system locale uses a non-UTF-8 encoding? I
think this blog post indicates so (as it describes Windows versions
older than the UTF-8 era), but I'm not fully confident that I
understand the details correctly.

https://developer.r-project.org/Blog/public/2021/12/07/upcoming-changes-in-r-4.2-on-windows/index.html

If so, I'm curious what package authors should do when the encodings
differ between the OS and R. For example (disclaimer: I don't intend
to blame processx at all; it's just an example), the CRAN check of
the processx package currently fails with this warning on R-devel
Windows.

>     1. UTF-8 in stdout (test-utf8.R:85:3) - Invalid multi-byte character at end of stream ignored
https://cran.r-project.org/web/checks/check_results_processx.html

As far as I know, processx launches an external process and captures
its output, and I suspect the problem is that the output of the
process is in a non-UTF-8 encoding while R assumes it's UTF-8. I have
experienced similar problems with other packages as well, and they
disappear if I switch R's locale to match the OS's with
Sys.setlocale(). So, I think it would be great if there were some
guidance for package authors on how to handle this properly.
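
To make the suspicion concrete, here is a minimal sketch in R. It is
not processx's actual code: the CP932 encoding and the two bytes are
only my assumptions, standing in for the output of a child process on
a system with a non-UTF-8 encoding, and it assumes that iconv() on the
platform knows CP932.

```r
# Two bytes that encode HIRAGANA LETTER A (U+3042) in CP932.
# CP932 is only an assumed example of a non-UTF-8 system encoding.
raw_out <- as.raw(c(0x82, 0xa0))

# What I suspect happens: the bytes are taken as UTF-8 without conversion.
as_utf8 <- rawToChar(raw_out)
Encoding(as_utf8) <- "UTF-8"
validUTF8(as_utf8)  # FALSE: 0x82 cannot start a UTF-8 sequence

# Converting explicitly from the encoding the child process used
# gives a valid UTF-8 string.
fixed <- iconv(rawToChar(raw_out), from = "CP932", to = "UTF-8")
validUTF8(fixed)
```

The point is only that the captured bytes need an explicit conversion
from the encoding the external process actually used; how a package
should find out that encoding is exactly what I'm unsure about.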

Any suggestions?

Best,
Yutani