[R-pkg-devel] UX for "WARNING unable to translate to native encoding"?

Tomas Kalibera tom@@@k@||ber@ @end|ng |rom gm@||@com
Mon Aug 16 14:48:32 CEST 2021


On 8/16/21 12:42 PM, Ivan Krylov wrote:
> On Mon, 16 Aug 2021 09:05:54 +0000
> David Norris <david using precisionmethods.guru> wrote:
>
>> Unicode U+00d7 (times), U+00b1 (plus-minus) and U+03bc (mu) have
>> equivalents in Latin-1 encoding, and I have used these without
>> difficulty in strings. Neither U+2206 (INCREMENT) nor U+0394 (Greek
>> Delta) does.
> But not in some other locale encodings on Windows (e.g. CP-1251), nor
> in some single-byte locale encodings on *nix-like systems (e.g.
> ru_RU.KOI8-R), which are admittedly much rarer nowadays than on
> Windows. Unless I'm mistaken, the "\u2206t" in your example needs to
> become a symbol, and symbols are always translated into the locale
> encoding [1] [2].
>
> I would expect this warning to be a problem for CRAN, but I'm just
> another package developer, so I could be wrong.
>
Yes, this is a problem. Only ASCII characters should be used in symbol 
names (in R packages), as they can be represented in every (supported) 
locale.
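
As a rough illustration of the ASCII-only rule, here is a minimal sketch that flags candidate symbol names containing non-ASCII characters. The helper is_ascii() and the name delta_t are hypothetical, made up for this example; iconv() is base R and returns NA for input it cannot convert on most platforms. The plotmath label is just one possible way to still show a Greek Delta in a plot while keeping the source ASCII.

```r
# Hypothetical helper (not part of R): TRUE where a string is pure ASCII.
# Relies on iconv() returning NA when a character cannot be converted.
is_ascii <- function(x) !is.na(iconv(x, from = "UTF-8", to = "ASCII"))

# Candidate symbol names; "\u2206t" is the name from this thread, written
# with a \u escape so that this file itself stays ASCII.
candidates <- c("delta_t", "\u2206t", "mu")
print(data.frame(name_ok = is_ascii(candidates)))

# An ASCII symbol name that can be represented in every locale.
delta_t <- 0.1

# Assumed alternative for plot labels: a plotmath expression, which
# graphics devices render with a Greek capital Delta.
label <- quote(Delta * t)
```

Running a check like this over the names a package defines is only a sketch of the idea; it does not replace R CMD check.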

When a character is not directly representable, Windows may "best-fit" it
(replace it with a similar-looking character) during translation to the
native encoding, but that can produce surprising results and should not
be relied on, definitely not in packages.
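
One way to avoid depending on silent substitution is to test explicitly whether a string survives conversion to a given encoding and back. The following is only a sketch under assumptions: survives_round_trip() is a hypothetical helper, "latin1" stands in for whatever native encoding is of interest, and the set of encoding names iconv() accepts is platform-dependent.

```r
# Hypothetical helper (not part of R): TRUE if x converts to `encoding`
# and back to UTF-8 unchanged, FALSE if conversion fails or alters x.
survives_round_trip <- function(x, encoding) {
  there <- iconv(x, from = "UTF-8", to = encoding)
  back  <- iconv(there, from = encoding, to = "UTF-8")
  # A failed conversion gives NA; a substituted character gives back != x.
  !is.na(back) & back == x
}

# U+00B1 (plus-minus) is part of Latin-1.
print(survives_round_trip("\u00b1 0.5", "latin1"))

# U+2206 (INCREMENT) is not part of Latin-1.
print(survives_round_trip("\u2206t", "latin1"))
```

Comparing the round-tripped string with the original catches both an outright conversion failure and a substituted look-alike character.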

Best
Tomas


