[Rd] test fails when requesting LC_CTYPE

Martin Maechler maechler at stat.math.ethz.ch
Sat May 20 15:35:10 CEST 2017


>>>>> Kasper Daniel Hansen <kasperdanielhansen at gmail.com>
>>>>>     on Fri, 19 May 2017 20:09:24 -0400 writes:

    > I rebuilt R with
    > export LC_CTYPE=en_US.UTF-8
    > and the test still fails.  Surprisingly, when I run R from the bin directory
    > and execute the test code, it runs without error:

  >> oloc <- Sys.getlocale("LC_CTYPE")
  >> mbyte.lc <- {
  > +     if(.Platform$OS.type == "windows")
  > +       "English_United States.28605"
  > +     else if(grepl("[.]UTF-8$", oloc, ignore.case=TRUE)) # typically nowadays
  > +       oloc
  > +     else
  > +       "C.UTF-8" # or rather "en_US.UTF-8" (? from  system("locale -a| fgrep .UTF-8") )
  > + }
  >> stopifnot(identical(Sys.setlocale("LC_CTYPE", mbyte.lc), mbyte.lc))
  >> oloc
  > [1] "en_US.UTF-8"
  >> mbyte.lc
  > [1] "en_US.UTF-8"

I had been making these changes in R-devel after offline
discussions with Linux users for whom the original check (using "en_UK.UTF-8")
failed.

What I read below suggests that "C.UTF-8" is not okay
as a fallback either.
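
For anyone who wants to check which candidates their own system accepts,
here is a minimal sketch.  It relies only on the documented behaviour that
Sys.setlocale() returns "" (with a warning) when the request cannot be
honored.  The helper name can_set_ctype() is hypothetical, not something
in R or in the tests:

```r
## Hypothetical helper (not part of R): can this LC_CTYPE locale be set here?
can_set_ctype <- function(loc) {
    old <- Sys.getlocale("LC_CTYPE")
    ## always restore the previous LC_CTYPE setting on exit
    on.exit(suppressWarnings(Sys.setlocale("LC_CTYPE", old)))
    ## Sys.setlocale() returns "" when the OS refuses the request
    res <- suppressWarnings(Sys.setlocale("LC_CTYPE", loc))
    nzchar(res)
}

## Probe the candidates discussed in this thread, plus plain "C"
sapply(c("C.UTF-8", "en_US.UTF-8", "C"), can_set_ctype)
```

On the RHEL 6 machine reported below, I would expect the "C.UTF-8" entry
to be FALSE, given the warning shown there.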

It seems we should use "en_US.UTF-8" as fallback instead
(though I assume that won't work in North Korea).

I've committed a version that does that _and_ no longer stops
when that identical() does not return TRUE.
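
Roughly, the idea looks like the sketch below.  This is an illustration
under my own assumptions, not the committed code: the variable `ok` and
the message text are made up here, and the actual str() checks are only
indicated by a comment.

```r
oloc <- Sys.getlocale("LC_CTYPE")
mbyte.lc <- {
    if(.Platform$OS.type == "windows")
        "English_United States.28605"
    else if(grepl("[.]UTF-8$", oloc, ignore.case=TRUE)) # typically nowadays
        oloc
    else
        "en_US.UTF-8" # fallback, instead of "C.UTF-8"
}
## Record whether the locale could be set, instead of stopifnot()
ok <- identical(suppressWarnings(Sys.setlocale("LC_CTYPE", mbyte.lc)),
                mbyte.lc)
if(ok) {
    ## ... the multibyte str() checks would run here ...
} else {
    message("could not set LC_CTYPE to ", dQuote(mbyte.lc),
            "; skipping the multibyte checks")
}
## Restore the original setting in either case
invisible(suppressWarnings(Sys.setlocale("LC_CTYPE", oloc)))
```

The point of the design is that an unavailable locale skips the check
rather than halting the whole test run.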

Martin

    > On Fri, May 19, 2017 at 7:29 PM, Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:

    >> On RedHat Enterprise Linux 6, the test below fails (this is using the
    >> stock GCC 4.4.7) from R-devel r72707.  LC_CTYPE is unset when I run it, but
    >> LANG=en_US.UTF-8
    >> 
    >> It also failed "yesterday", when, as far as I recall, the test code looked a
    >> bit different.
    >> 
    >> Best,
    >> Kasper
    >> 
    >> > ## Results differed by platform, but some gave incorrect results on string 10.
    >> >
    >> >
    >> > ## str() on large strings (in multibyte locales; changing locale may not work everywhere
    >> > oloc <- Sys.getlocale("LC_CTYPE")
    >> > mbyte.lc <- {
    >> +     if(.Platform$OS.type == "windows")
    >> +       "English_United States.28605"
    >> +     else if(grepl("[.]UTF-8$", oloc, ignore.case=TRUE)) # typically nowadays
    >> +       oloc
    >> +     else
    >> +       "C.UTF-8" # or rather "en_US.UTF-8" (? from  system("locale -a| fgrep .UTF-8") )
    >> + }
    >> > stopifnot(identical(Sys.setlocale("LC_CTYPE", mbyte.lc), mbyte.lc))
    >> Error: identical(Sys.setlocale("LC_CTYPE", mbyte.lc), mbyte.lc) is not TRUE
    >> In addition: Warning message:
    >> In Sys.setlocale("LC_CTYPE", mbyte.lc) :
    >> OS reports request to set locale to "C.UTF-8" cannot be honored
    >> Execution halted
    >>


