[Rd] use of UTF-8 \uxxxx escape sequences in function arguments
Thomas Zumbrunn
thomas at zumbrunn.name
Wed Jan 18 23:54:43 CET 2012
While preparing a function that contains non-ASCII characters for inclusion
in a package, I replaced all non-ASCII characters with \uxxxx escape
sequences in order to make the package portable (and to pass
"R CMD check"). What I didn't expect: when the default values of a function
argument are written with \uxxxx escapes, the caller apparently has to use
the same escapes when calling the function, even when working in a UTF-8
locale. Is this intended behaviour?
Here's an example to illustrate the (putative) problem:
## function that uses non-ASCII characters in arguments
plain <- function(myarg = c("Basel", "Bern", "Zürich")) {
  myarg <- match.arg(myarg)
}
## function that uses UTF-8 escape sequences in arguments
escaped <- function(myarg = c("Basel", "Bern", "Z\u00BCrich")) {
  myarg <- match.arg(myarg)
}
## test
plain("Zürich") ## works
plain("Z\u00BCrich") ## fails
escaped("Zürich") ## fails
escaped("Z\u00BCrich") ## works
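One check that may help narrow this down is whether the escaped string and the literal string really contain the same characters. The following is only a sketch and was not part of the original test; the variable names are made up for illustration. It compares the code point behind \u00BC with that of the u-umlaut, which is U+00FC (decimal 252); U+00BC (decimal 188) is the "vulgar fraction one quarter" sign. The u-umlaut is built with intToUtf8() so that the snippet itself stays ASCII-only.

```r
## Hypothetical diagnostic (names are illustrative, not from the package)
escaped_choice <- "Z\u00BCrich"                 # escape as used in escaped()
u_umlaut       <- intToUtf8(252L)               # U+00FC, u with diaeresis
literal_choice <- paste("Z", u_umlaut, "rich", sep = "")

## code point of the second character in each string
utf8ToInt(substr(escaped_choice, 2, 2))         # 188, i.e. U+00BC
utf8ToInt(substr(literal_choice, 2, 2))         # 252, i.e. U+00FC

## the two strings differ, so match.arg() cannot match one against the other
identical(escaped_choice, literal_choice)       # FALSE

## with \u00FC the escaped and the literal spelling are the same string
corrected_choice <- "Z\u00FCrich"
identical(corrected_choice, literal_choice)     # TRUE
```

If this is what is going on here, the behaviour above would come from the two strings being different rather than from how match.arg() treats escapes.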
Thank you for your help.
Thomas Zumbrunn
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8
 [4] LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C                  LC_ADDRESS=C
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C