[R-SIG-Mac] font encoding issue
Simon Urbanek
simon.urbanek at math.uni-augsburg.de
Wed Nov 24 21:10:38 CET 2004
On Nov 23, 2004, at 9:17 PM, Denis Chabot wrote:
> I suppose this is a font encoding error. Can it be fixed, or is there
> something in R itself which prevents it from even displaying such
> characters?
It's a bug and a feature of the R GUI ;).
Internally, the R GUI uses UTF-8 encoding for text handling, including the
editor. The idea was to have a localized GUI with support for any
language, and UTF-8 is the natively supported format in Cocoa. To make
the mess even bigger, there was a bug in the GUI that converted the
UTF-8 to a vanilla C string at one point, which resulted in the wrong
behavior you spotted.
I have now fixed that latter bug, so your comments should appear
undistorted:
> # exemple à suivre
If this is all you need, get tonight's nightly build.
However, using UTF-8 in strings in R is not that easy. Even if all you
want is to retain the UTF-8 contents (i.e. tell R to not worry about
the encoding and just print back what it gets), the actual problem is
that R escapes certain characters, regardless of the locale:
> Sys.getlocale()
[1] "en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/C"
> "Müll"
[1] "M\303\274ll"
This means that the don't-worry approach doesn't work. The latest info
on encodings and UTF-8 I could find was for 1.8.1, but I suspect that
nothing has changed since: basically R has no UTF-8 support and there
will be none unless someone with enough time, energy and skill takes up
the task.
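
For reference, the escapes \303\274 in the output above are the two
bytes of the UTF-8 encoding of "ü", written in octal. A minimal sketch
that reproduces them, assuming a recent R version in which charToRaw(),
enc2utf8() and the "\u00fc" escape are available (this is an assumption
and does not apply to the R version discussed here):

```r
# "\u00fc" is U+00FC (u with diaeresis); enc2utf8() makes sure the
# string is stored as UTF-8 regardless of the session locale.
s <- enc2utf8("M\u00fcll")

# Raw bytes of the whole string: 4d c3 bc 6c 6c
bytes <- charToRaw(s)

# Octal form of the two bytes that encode the "ü"
octal <- sprintf("\\%o", as.integer(bytes[2:3]))

print(bytes)
cat(octal, sep = "")
cat("\n")
```

So the escaped output is not garbage: it is the original UTF-8 byte
sequence, printed byte by byte instead of as one character.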
The bottom line is that I'll try to fix the GUI so that it uses
the locale-specific encoding in its internal representation and for
all communication with R. The drawback will be that users on systems
with different locales won't be able to use each other's files
transparently. Still, this should fix things for users of simpler
encodings (such as Latin1), but for more general support of UTF-8 or
other multi-byte encodings we will have to wait until there is a
global solution in R.
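
The file-sharing drawback comes from the same character having
different byte representations in different encodings. A small sketch
of this, assuming a recent R version where iconv() accepts
toRaw = TRUE and the "latin1" encoding name is known to the platform;
the variable names are only illustrative:

```r
s <- enc2utf8("M\u00fcll")

# Same text, two encodings, different bytes on disk
utf8_bytes   <- charToRaw(s)
latin1_bytes <- iconv(s, from = "UTF-8", to = "latin1", toRaw = TRUE)[[1]]

print(utf8_bytes)    # "ü" takes two bytes (c3 bc) in UTF-8
print(latin1_bytes)  # "ü" is the single byte fc in Latin1

# Reading the Latin1 bytes back correctly requires knowing the
# encoding they were written in.
restored <- iconv(rawToChar(latin1_bytes), from = "latin1", to = "UTF-8")
print(identical(charToRaw(restored), utf8_bytes))
```

A file saved under one locale's encoding is therefore only readable
under another if the reader converts it explicitly.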
Cheers,
Simon