[R-pkg-devel] UTF-8 characters inside R functions for data transformation

Simon Urbanek
Tue Dec 21 21:59:41 CET 2021


Xavier,

The short answer is no: there is no guarantee that the user's system supports any encoding other than ASCII, so such code would not run there. Hence you can't use non-ASCII characters in symbols.

That said, you can use Unicode _strings_, so metadata[["\u00e1cc\u00e9nts"]] will work in an ASCII locale, but I would strongly caution against such objects in public packages.
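
As a minimal sketch of that approach (the data frame and its contents here are made up for illustration, and `col` is just a hypothetical helper variable), the column name is written as an ordinary string with \u escapes, so the source file itself stays pure ASCII:

```r
# Hypothetical stand-in for the cached `metadata` data frame.
metadata <- data.frame(x = 1:3)

# Name the column using \u escapes; the source file contains only ASCII.
names(metadata) <- "variable.with.\u00e1cc\u00e9nts"

# Access the column through a string with [[ ]] instead of `$` with backticks.
col <- "variable.with.\u00e1cc\u00e9nts"
values <- metadata[[col]]
```

The point is that \u escapes are accepted inside quoted strings but not inside backticks, which is why the `$` form fails while `[[` with a string does not.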

Cheers,
Simon


> On Dec 22, 2021, at 9:18 AM, X FiM <xfim.ll using gmail.com> wrote:
> 
> Dear all,
> 
> Somewhat related to a question that I posted a while ago (see
> https://stat.ethz.ch/pipermail/r-package-devel/2021q4/007540.html), once
> I've got a dataframe in my cache, some of the functions need to use some of
> the variables. It turns out that some of the columns contain UTF-8
> characters, and therefore I need to be able to call
> `metadata$variable.with.áccénts`.
> 
> But the package development checks warn me that no non-ASCII characters are
> allowed in the files "checking R files for non-ASCII characters ...
> WARNING".
> 
> I have specified that the package uses UTF-8 in DESCRIPTION ("Encoding:
> UTF-8"). I have also defined "options(encoding = "UTF-8")" before calling
> the checks, but nothing seems to matter. I have also tried to give the
> proper UTF-8 codes, like in `metadata$variable.with.\u00e1cc\u00e9nts`, but
> with no luck either. Also it gives an error with "\uxxxx sequences not
> supported inside backticks".
> 
> So which approach do you recommend? Is there any solution that I can use to
> call variables that use non-ASCII characters?
> 
> Thank you very much.
> 
> -- 
> Xavier