[Rd] issue with print()ing multibyte characters on R 4.0.4
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Feb 17 11:20:03 CET 2021
On 17/02/2021 04:58, Hiroaki Yutani wrote:
> Hi all,
> I saw several people on Japanese locale claim that, on R 4.0.4, print()
> doesn't display Japanese characters correctly. This seems to happen only
> on Windows and on macOS (I usually use Linux and I don't see this problem).
> For example, in the result below, "鬼" and "外" are displayed in "\uXXXX"
> format. What's curious here is that "は" is displayed as it is, by the way.
>  "\u9b3cは\u5916"
> But, if I use such functions as message() or cat(), the string is displayed
> as it is.
Those functions do not escape non-printable characters, so that is as expected.
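
As a rough illustration of that difference (a sketch in Python, not R's actual C code; the table PRINTABLE_RANGES and the functions is_printable, print_style and cat_style are hypothetical names made up for this example): a print()-style routine escapes every character that its printability table does not list, while a cat()-style routine writes the string unchanged. The CJK ideograph range is left out of the table on purpose, which reproduces the reported symptom.

```python
from bisect import bisect_right

# Hypothetical, heavily truncated table of printable code point ranges,
# sorted by start. The CJK Unified Ideographs range (0x4E00-0x9FFF) is
# deliberately absent, to mimic a table with one line missing.
PRINTABLE_RANGES = [
    (0x0020, 0x007E),  # ASCII printable
    (0x3040, 0x309F),  # Hiragana
]
_STARTS = [lo for lo, _ in PRINTABLE_RANGES]


def is_printable(cp):
    """Binary search for code point cp in the sorted range table."""
    i = bisect_right(_STARTS, cp) - 1
    return i >= 0 and PRINTABLE_RANGES[i][0] <= cp <= PRINTABLE_RANGES[i][1]


def print_style(s):
    """Escape every character the table does not list as printable."""
    return "".join(
        c if is_printable(ord(c)) else "\\u%04x" % ord(c) for c in s
    )


def cat_style(s):
    """Write the string as it is, with no printability check."""
    return s


print(print_style("鬼は外"))  # \u9b3cは\u5916
print(cat_style("鬼は外"))    # 鬼は外
```

"は" (U+306F) falls in the Hiragana range that is still present, so it survives, whereas "鬼" (U+9B3C) and "外" (U+5916) fall in the missing range and are escaped.
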
> Considering the fact that it seems only Windows and macOS are affected, I
> suspect this is somehow related to this change described in the release
> note, (though I have no idea what change this is):
> The internal table for iswprint (used on Windows, macOS and AIX) has been
> updated to include many recent Unicode characters.
> Before I'm going to file this issue on Bugzilla, I'd like to confirm if
> this is not the intended change, and, if this is actually intended, I want
> to discuss how to improve this behaviour.
I am sorry: this was not intended, but no one reported it in the run-up
to 4.0.4. It seems to be working in R-devel, so I suggest you check
that or go back to 4.0.3.
It looks like a line in the iswprint table got deleted in the merge from
R-devel. I will try to set up some automated checks to see if I can
find any other problems, but that will take a few days.
Brian D. Ripley, ripley using stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford