[R-SIG-Mac] Help with locale on OS X

Simon Urbanek simon.urbanek at r-project.org
Tue Apr 25 03:27:57 CEST 2006


On Apr 24, 2006, at 6:02 PM, Don MacQueen wrote:

> For several years I've been running scripts with this expression
>
>    substr(tmp.un[substr(tmp.un,1,1)=='u'],1,1) <- '\265'
>
> whose purpose is to replace the letter 'u' with a
> Greek mu, to change, for example,  'ug/L' to
> 'µg/L'. It has been working over several versions
> of R.
>
> Recently, I started (sometimes?) getting error messages from the  
> expression:
>
>>  substr(tmp.un[substr(tmp.un,1,1)=='u'],1,1) <- '\265'
> Error in "substr<-"(`*tmp*`, 1, 1, value = "<b5>") :
>          invalid multibyte string
>

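This is most likely because R is now running in a UTF-8 locale. The
escape '\265' is a lone byte 0xB5, which is not a valid UTF-8
sequence on its own, hence the "invalid multibyte string" error. You
can check that easily by looking for
"Natural language support but running in an English locale"
in the greeting.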
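
The same check can also be made from inside a running session. The
following is only a minimal sketch, assuming an R version that
provides l10n_info(); the variable names are illustrative:

```r
# Report the character-type locale of the running session,
# e.g. "en_US.UTF-8" or "C" (the exact string is platform dependent).
ctype <- Sys.getlocale("LC_CTYPE")
print(ctype)

# l10n_info() returns a list of flags describing the session encoding.
info <- l10n_info()
print(info[["MBCS"]])   # TRUE if a multibyte character set is in use
print(info[["UTF-8"]])  # TRUE if that character set is UTF-8
```

If the last line prints TRUE, single-byte escapes above \177 such as
'\265' are likely to be rejected as invalid multibyte strings.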

Starting with R 2.3.0 (due to updated gettext) the locale is
determined from the system settings. Previously, the system setting
was always ignored. It is now used only if no other locale setting,
such as LANG or LC_ALL, is present in the environment. If you want to
force the C locale, you should run your scripts using something like
LANG=C R
and this is what most scripts do if they want to force a locale-free
environment. Note, however, that the system utilities require UTF-8,
so forcing a non-UTF-8 locale will prevent you from using non-ASCII
characters in file names and the like. On Mac OS X it is usually
safer to write everything in UTF-8, including your scripts.
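
As an illustration of the UTF-8 route, the replacement could be
written with a Unicode escape instead of the byte escape '\265'. This
is a sketch only, not the original script: it assumes an R version
that accepts "\u" escapes in string literals, and the sample values in
tmp.un are made up for the example.

```r
# Hypothetical sample data standing in for the real unit strings.
tmp.un <- c("ug/L", "mg/L", "ug/kg")

# U+00B5 is the micro sign; written as a Unicode escape the string does
# not depend on how a single byte is interpreted in the current locale.
mu <- "\u00b5"

# Replace a leading 'u' with the micro sign, as the original
# substr<- expression was meant to do.
tmp.un <- sub("^u", mu, tmp.un)
print(tmp.un)
```

Using sub() with an anchored pattern avoids indexing the vector twice,
but it is a design choice for the example rather than a requirement;
the substr<- form should work as well once the replacement string is
valid in the session encoding.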

Cheers,
Simon


