[R-pkg-devel] UX for "WARNING unable to translate to native encoding"?

David Norris <david using precisionmethods.guru>
Mon Aug 16 17:38:09 CEST 2021


Thanks to you both, Ivan and Tomas. Capital D it is, then.
-David

From: Tomas Kalibera <tomas.kalibera using gmail.com>
Date: Monday, August 16, 2021 at 8:48 AM
To: Ivan Krylov <krylov.r00t using gmail.com>, David Norris <david using precisionmethods.guru>
Cc: "r-package-devel using r-project.org" <r-package-devel using r-project.org>
Subject: Re: [R-pkg-devel] UX for "WARNING unable to translate to native encoding"?


On 8/16/21 12:42 PM, Ivan Krylov wrote:
On Mon, 16 Aug 2021 09:05:54 +0000
David Norris <david using precisionmethods.guru> wrote:

Unicode U+00d7 (times), U+00b1 (plus-minus) and U+03bc (mu) have
equivalents in Latin-1 encoding, and I have used these without
difficulty in strings, neither U+2206 (INCREMENT) nor U+0394 (Greek
Delta) does

But not in some other locale encodings on Windows (e.g. CP-1251), nor
in some single-byte locale encodings on *nix-like systems (e.g.
ru_RU.KOI8-R), which are admittedly used much more rarely nowadays than
on Windows. Unless I'm mistaken, the "\u2206t" in your example needs to
become a symbol, and symbols are always translated into the locale
encoding [1] [2].
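
To make the point about locale encodings concrete, here is a minimal
sketch that probes which of the characters mentioned in this thread can
be encoded directly in a few single-byte encodings. It is written in
Python only because its codecs are easy to query. The names CHARS,
ENCODINGS, representable() and report() are made up for this
illustration, and the Python codecs are assumed to be reasonable
stand-ins for the corresponding locale encodings. This is not how R
performs the translation.

```python
# Sketch: probe whether characters from this thread can be encoded
# directly in a few single-byte encodings. The codecs below are assumed
# stand-ins for locale encodings; all names here are hypothetical.
CHARS = {
    "times": "\u00d7",
    "plus-minus": "\u00b1",
    "mu": "\u03bc",
    "increment": "\u2206",
    "Greek Delta": "\u0394",
    "capital D": "D",
}
ENCODINGS = ["latin-1", "cp1251", "koi8-r"]


def representable(ch, encoding):
    """Return True if ch can be encoded directly in the given codec."""
    try:
        ch.encode(encoding)  # strict mode: no substitution of look-alikes
        return True
    except UnicodeEncodeError:
        return False


def report():
    """Map each character name to {encoding: representable?}."""
    return {
        name: {enc: representable(ch, enc) for enc in ENCODINGS}
        for name, ch in CHARS.items()
    }


if __name__ == "__main__":
    for name, row in report().items():
        print(f"{name:12s}", row)
```

A strict codec lookup does no best-fit substitution, so a character
that only has a look-alike in the target encoding is reported as not
representable. Only the ASCII "D" is representable in all three
encodings probed here.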

I would expect this warning to be a problem for CRAN, but I'm just
another package developer, so I could be wrong.

Yes, this is a problem. Only ASCII characters should be used in symbol 
names (in R packages), as they can be represented in every (supported) 
locale.
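
As a rough illustration of the ASCII-only rule, the sketch below flags
candidate symbol names that contain any non-ASCII character. The
function non_ascii_names() and the example names are hypothetical, and
the check is written in Python rather than R purely for illustration.
In R itself, tools::showNonASCIIfile() can be used to list the lines of
a source file that contain non-ASCII characters.

```python
# Sketch: flag candidate symbol names that are not pure ASCII.
# non_ascii_names() and the example names are hypothetical.
def non_ascii_names(names):
    """Return the names that contain any character outside ASCII."""
    return [n for n in names if not n.isascii()]


candidates = ["\u2206t", "\u0394t", "Dt", "dose_mg"]
flagged = non_ascii_names(candidates)
print(flagged)  # the two names starting with U+2206 and U+0394
```

Names that pass this check, such as "Dt", can be represented in every
encoding that is a superset of ASCII.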

Some characters would be best-fitted by Windows (replaced by other 
similar characters) during translation to native encoding, if they are 
not representable directly, but that can produce surprising results and 
should not be relied on, definitely not in packages.

Best
Tomas




