[Rd] test fails when requesting LC_CTYPE
Martin Maechler
maechler at stat.math.ethz.ch
Sat May 20 15:35:10 CEST 2017
>>>>> Kasper Daniel Hansen <kasperdanielhansen at gmail.com>
>>>>> on Fri, 19 May 2017 20:09:24 -0400 writes:
> I rebuilt R with
> export LC_CTYPE=en_US.UTF-8
> and the test still fails. Surprisingly, when I run R from the bin directory
> and execute the test code, it runs without error:
>> oloc <- Sys.getlocale("LC_CTYPE")
>> mbyte.lc <- {
> + if(.Platform$OS.type == "windows")
> + "English_United States.28605"
> + else if(grepl("[.]UTF-8$", oloc, ignore.case=TRUE)) # typically nowadays
> + oloc
> + else
> + "C.UTF-8" # or rather "en_US.UTF-8" (? from system("locale -a| fgrep .UTF-8") )
> + }
>> stopifnot(identical(Sys.setlocale("LC_CTYPE", mbyte.lc), mbyte.lc))
>> oloc
> [1] "en_US.UTF-8"
>> mbyte.lc
> [1] "en_US.UTF-8"
I had made these changes in R-devel after offline
discussions with Linux users for whom the original check (using "en_UK.UTF-8")
failed.
What I read below suggests that "C.UTF-8" is not okay
as a fallback either.
It seems we should use "en_US.UTF-8" as fallback instead
(though I assume that won't work in North Korea).
I've committed a version that does that _and_ no longer stops
when that identical() call does not return TRUE.
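
For readers of the archive, here is a minimal sketch of the logic described above. It is not the committed test code: the helper name pick_mbyte_locale() and its arguments are hypothetical, and it assumes that Sys.setlocale() returns an empty string (with a warning) when the OS cannot honor the request.

```r
## Hypothetical helper (not part of R or its test suite):
## choose a UTF-8 LC_CTYPE candidate, falling back to "en_US.UTF-8".
pick_mbyte_locale <- function(oloc, os.type = .Platform$OS.type,
                              fallback = "en_US.UTF-8") {
    if (os.type == "windows")
        "English_United States.28605"
    else if (grepl("[.]UTF-8$", oloc, ignore.case = TRUE))
        oloc       # current locale is already UTF-8: keep it
    else
        fallback   # assumption: "en_US.UTF-8" is more widely installed than "C.UTF-8"
}

oloc <- Sys.getlocale("LC_CTYPE")
mbyte.lc <- pick_mbyte_locale(oloc)

## Do not stop on failure; just record whether the locale could be set.
res <- suppressWarnings(Sys.setlocale("LC_CTYPE", mbyte.lc))
if (identical(res, mbyte.lc)) {
    cat("LC_CTYPE set to", res, "\n")
    ## ... the multibyte str() checks would run here ...
} else {
    cat("could not set LC_CTYPE to", mbyte.lc, "; skipping multibyte checks\n")
}

## Restore the original setting.
invisible(suppressWarnings(Sys.setlocale("LC_CTYPE", oloc)))
```

The point of the design is that a missing locale on the build machine makes the multibyte checks be skipped rather than halting the whole test run.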
Martin
> On Fri, May 19, 2017 at 7:29 PM, Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:
>> On RedHat Enterprise Linux 6, the test below fails (this is using the
>> stock GCC 4.4.7) from R-devel r72707. LC_CTYPE is unset when I run it, but
>> LANG=en_US.UTF-8
>>
>> It also failed "yesterday" when, as far as I recall, the test code looked a
>> bit different.
>>
>> Best,
>> Kasper
>>
>> > ## Results differed by platform, but some gave incorrect results on string 10.
>> >
>> >
>> > ## str() on large strings (in multibyte locales; changing locale may not work everywhere
>> > oloc <- Sys.getlocale("LC_CTYPE")
>> > mbyte.lc <- {
>> + if(.Platform$OS.type == "windows")
>> + "English_United States.28605"
>> + else if(grepl("[.]UTF-8$", oloc, ignore.case=TRUE)) # typically nowadays
>> + oloc
>> + else
>> + "C.UTF-8" # or rather "en_US.UTF-8" (? from system("locale -a| fgrep .UTF-8") )
>> + }
>> > stopifnot(identical(Sys.setlocale("LC_CTYPE", mbyte.lc), mbyte.lc))
>> Error: identical(Sys.setlocale("LC_CTYPE", mbyte.lc), mbyte.lc) is not TRUE
>> In addition: Warning message:
>> In Sys.setlocale("LC_CTYPE", mbyte.lc) :
>> OS reports request to set locale to "C.UTF-8" cannot be honored
>> Execution halted
>>