[R] Plotting the ASCII character set.

Ivan Krylov
Sat Jul 3 09:40:28 CEST 2021


Hello Rolf Turner,

On Sat, 3 Jul 2021 14:02:59 +1200
Rolf Turner <r.turner using auckland.ac.nz> wrote:

> Can anyone suggest how I might get my plot_ascii() function working
> again?  Basically, it seems to me, the question is:  how do I persuade
> R to read in "\260" as "\ub0" rather than "\xb0"?

Part of the problem is that the "\xb0" byte is not in ASCII, which
covers only the lower half of possible 8-bit bytes. I guess that
strings containing bytes with the highest bit set used to be interpreted
as Latin-1 on your machine, but now get interpreted as UTF-8, which
changes their meaning: in UTF-8, a byte with the highest bit set is
always part of a multi-byte sequence, so a lone "\xb0" makes the string
invalid.
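
To make the difference concrete, here is a small sketch that looks at
the bytes involved. The variable names a and b are just placeholders of
mine; the functions are all from base R.

```r
a <- "\xb0"                  # one byte, 0xb0 (the same byte as "\260")
charToRaw(a)                 # shows the single byte stored in the string
validUTF8(a)                 # FALSE: a lone 0xb0 byte is not valid UTF-8

b <- "\u00b0"                # the degree sign given as a Unicode code point
charToRaw(enc2utf8(b))       # two bytes once encoded as UTF-8: c2 b0
```

So the byte that means "degree sign" in Latin-1 is only the second half
of the UTF-8 encoding of the same character.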

The good news is, since it's Latin-1, which is natively supported by R,
there are even multiple options:

1. Mark the string as Latin-1 by setting Encoding(a) <- 'latin1' and
let R do the re-encoding if and when Pango asks it for a UTF-8-encoded
string.

2. Decode Latin-1 into the locale encoding by using iconv(a, 'latin1',
'') (or set the third parameter to 'UTF-8', which would give almost the
same result on a machine with a UTF-8 locale). The result is, again, a
string where Encoding(a) matches the truth. Explicitly setting UTF-8
may be preferable on Windows machines running pre-UCRT builds of R
where the locale encoding may not contain all Latin-1 characters, but
that's not a problem for you, as far as I know.
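
As a minimal sketch of the two options (using a one-element vector
instead of your full set of characters; a1 and a2 are names I made up
for illustration):

```r
a <- "\xb0"                          # Latin-1 byte for the degree sign

# Option 1: only mark the string as Latin-1; the bytes stay as they are.
a1 <- a
Encoding(a1) <- "latin1"
Encoding(a1)                         # "latin1"
charToRaw(a1)                        # still the single byte b0

# Option 2: convert the bytes, here explicitly to UTF-8.
a2 <- iconv(a, "latin1", "UTF-8")
Encoding(a2)                         # "UTF-8"
charToRaw(a2)                        # c2 b0
```

In the first case R re-encodes the string whenever something asks for
UTF-8; in the second the conversion has already been done.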

For any source encoding other than Latin-1 or UTF-8, only option (2)
applies, since Encoding() can mark a string only as 'latin1', 'UTF-8'
or 'bytes'.
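
For example, here is a sketch of option (2) with a different source
encoding. I am assuming Windows-1252 input purely for illustration, and
that your iconv implementation knows it under the name "CP1252"
(encoding names are platform-dependent; see iconvlist()).

```r
x <- "\x80"                          # the euro sign in Windows-1252
y <- iconv(x, "CP1252", "UTF-8")     # convert the bytes to UTF-8
Encoding(y)                          # "UTF-8"
charToRaw(y)                         # e2 82 ac, i.e. "\u20ac"
```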

I have verified that your example works on my GNU/Linux system with a
UTF-8 locale if I use either option.

-- 
Best regards,
Ivan


