[R-SIG-Mac] Reading in a table originally with ISO-latin1 encoding (in Linux)
Seppo Nyrkkö
seppo.nyrkko at helsinki.fi
Wed May 16 17:39:10 CEST 2007
Dear Mac & R users,
Returning to this issue: Antti and I found that this particular problem
with R.app and Scandinavian characters was triggered by Mac OS X's
system-wide language locale having been set to "C" (POSIX) during the
OS X installation.
(Details follow.)
On June 22, 2006 at 19:43, Antti Arppe wrote:
> Dear colleagues,
>
> With the help of a colleague of mine here in Helsinki (Seppo Nyrkkö)
> who looked at the innards of the R source code for Mac it turned out
> that this was in the end indeed an issue concerning the Mac locale and
> its settings and not R.
>
> Though we had tried this earlier by changing the LANG variable to
> 'fi_FI', we hadn't looked closely enough at the available locales (with
> locale -a) to select exactly the right value, namely:
>
> LANG=fi_FI.ISO8859-1; export LANG;
>
> With this configuration R was able to happily read in my original
> table with the Scandinavian characters in the header, without any fuss.
>
> Thanks for your advice, and wishing all a good Midsummer,
>
> -Antti Arppe
>
At startup, R checks whether or not it is running in a locale with an
international character set. The locale information is inherited from
the parent process, i.e. the OS X window server, which reads its locale
from the system-wide settings. This information determines which
characters are printable and which are displayed as substituted
characters for the whole R session. The POSIX C locale allows only
7-bit ASCII characters to be displayed, which prevents R.app from
printing the Scandinavian characters (ä, ö, å).
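To see which locale a running session actually inherited, something along these lines can be typed at the R prompt. It is only a sketch: it uses the base functions Sys.getlocale() and l10n_info(), and the variable names are made up for illustration.

```r
# Report the character-type locale this R session inherited at startup.
ctype <- Sys.getlocale("LC_CTYPE")
info  <- l10n_info()   # list with flags including "UTF-8" and "Latin-1"

cat("LC_CTYPE:       ", ctype, "\n")
cat("UTF-8 locale:   ", info[["UTF-8"]], "\n")
cat("Latin-1 locale: ", info[["Latin-1"]], "\n")

# "C" (or "POSIX") is the ASCII-only case described above.
in_c_locale <- ctype %in% c("C", "POSIX")
if (in_c_locale) {
  cat("Session is in the C locale: only 7-bit ASCII prints as-is.\n")
}
```

If the first line reports "C", the session is in the situation described above.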
The first step toward recovery is to change the system from the C locale
to an international locale that allows UTF-8 character sequences (this
can be done through System Preferences). This enables proper output of
Unicode characters in the R.app console.
Then, to read and write files in the Latin-1 (ISO-8859-1) character set
(note that the system now uses UTF-8 by default), one should change the
default encoding for file operations by entering
'options(encoding="iso-8859-1")'
at the command prompt. It is also possible to add this setting to the
startup file ".Rprofile" in the project's startup directory.
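As a rough illustration of the file-encoding step, the sketch below writes a tiny Latin-1 table to a temporary file and reads it back in two ways: with the session-wide option mentioned above, and with the per-call fileEncoding argument of read.table(). It assumes the session already runs in a UTF-8 locale. The column names ("ikä", "määrä") and the data are invented for the example.

```r
# Build a small tab-separated table whose header has Scandinavian letters
# (written with \u escapes so this script itself stays plain ASCII).
txt <- paste0(c("ik\u00e4\tm\u00e4\u00e4r\u00e4", "30\t2", "41\t5"),
              "\n", collapse = "")

# Store it on disk as Latin-1 bytes, as a file from the Linux side would be.
path <- tempfile(fileext = ".txt")
writeBin(iconv(txt, from = "UTF-8", to = "latin1", toRaw = TRUE)[[1]], path)

# Variant 1: change the default encoding for file operations, then restore it.
old <- options(encoding = "iso-8859-1")
tab_global <- read.table(path, header = TRUE, sep = "\t", check.names = FALSE)
options(old)

# Variant 2: declare the encoding for this one call only.
tab <- read.table(path, header = TRUE, sep = "\t",
                  fileEncoding = "latin1", check.names = FALSE)

print(names(tab))
unlink(path)
```

The per-call form leaves the default encoding alone, which may be preferable when only some input files are Latin-1.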
Changing the locale in the command-line shell session (either by hand
or in the shell profile script) might not be the best solution here,
since other locale-aware OS X applications, launched from the window
manager, would remain in the C locale.
With best regards,
Seppo