[R-pkg-devel] Unicode Name Warnings for Package Constant

Tomas Kalibera
Fri Sep 3 08:31:16 CEST 2021


In current R releases, R symbol names are stored in the native encoding, 
which on Windows (the standard MSVCRT builds, up to and including R 4.1) 
cannot be UTF-8 or any other Unicode encoding. So you cannot portably use 
characters outside the native encoding in, e.g., names of vector elements 
or of bindings in an environment, and hence not even as keys in a hash 
map implemented using an environment.
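
To see which situation a given session is in, one can ask R whether its 
native encoding is UTF-8. This is only a minimal sketch: the variable 
name is_utf8_session is made up for illustration, and the check does not 
by itself guarantee that every operation handles such names.

```r
# l10n_info() describes the native encoding of the running R session.
info <- l10n_info()

# TRUE when the native encoding is UTF-8 (e.g. UCRT builds, most Unix/macOS
# setups); FALSE on MSVCRT builds of R for Windows.
is_utf8_session <- isTRUE(info[["UTF-8"]])
is_utf8_session
```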

Such names work in the UCRT builds of R for Windows on recent Windows 10 
(and have worked on Unix and macOS for many years), but until those 
builds become the norm, it might be easiest to give up on this feature in 
your package. In principle, you can still work with UTF-8 strings in many 
operations (one needs to check the documentation for each), as long as 
they are not used as R symbols. So you can do some operations e.g. 
with a vector like

c(mu = "\u00b5")

where the UTF-8 string is a value, not a name.
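
As a rough sketch of that idea applied to the mu/micro case, the 
characters can be kept as values of an unnamed vector and replaced one by 
one. The names mu_chars and replace_mu are hypothetical, not taken from 
janitor, and whether gsub() handles these strings on a given platform 
should be checked against its documentation.

```r
# All mu/micro look-alikes from the original message, stored as values
# (never as names), so no symbol in the native encoding is needed.
mu_chars <- c(
  "\u00b5", "\u03bc", "\u3382", "\u338c", "\u338d",
  "\u3395", "\u339b", "\u33b2", "\u33b6", "\u33bc"
)

# Replace each listed character with "u"; fixed = TRUE treats the
# pattern as a literal string rather than a regular expression.
replace_mu <- function(x, chars = mu_chars) {
  for (ch in chars) {
    x <- gsub(ch, "u", x, fixed = TRUE)
  }
  x
}

result <- replace_mu("5 \u00b5g and 3 \u03bcL")
result
```

If a named vector is really needed, the names can be attached at run time, 
e.g. setNames(rep("u", length(mu_chars)), mu_chars), which is the kind of 
work-around the original message reports as avoiding the warnings.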

Best
Tomas


On 9/2/21 5:24 PM, bill using denney.ws wrote:
> Hello,
>
> In the janitor package, we want to optionally support conversion from
> Unicode characters that visually map to mu or micro to the character "u".
> For that, we were thinking of creating an unexported character vector
> constant with names of all the Unicode mu/micro characters and values of
> "u".  As a work-around, I was able to fix the issue using setNames(), but
> it was a non-intuitive fix, and I would prefer to just create the named
> character vector directly.
>
> Is there a good way to prevent the warnings below?
>
> When running the following (on Windows 10 with R 4.1.0) either during a
> normal R session or while checking the package (via devtools::check()), we
> get several warnings:
>
> mu_to_u <-
>    c(
>      "\u00b5"="u", "\u03bc"="u", "\u3382"="u", "\u338c"="u", "\u338d"="u",
>      "\u3395"="u", "\u339b"="u", "\u33b2"="u", "\u33b6"="u", "\u33bc"="u"
>    )
>
>    Warnings in file 'R/clean_names.R':
>      unable to translate '<U+3382>' to native encoding
>      unable to translate '<U+338C>' to native encoding
>      unable to translate '<U+338D>' to native encoding
>      unable to translate '<U+3395>' to native encoding
>      unable to translate '<U+339B>' to native encoding
>      unable to translate '<U+33B2>' to native encoding
>      unable to translate '<U+33B6>' to native encoding
>      unable to translate '<U+33BC>' to native encoding
>
> I tried wrapping this in suppressWarnings(), but the warnings still
> occurred.
>
> Thanks,
>
> Bill