[R] UTF-8 or Unicode on Windows PC
Prof Brian Ripley
ripley at stats.ox.ac.uk
Mon Apr 21 19:09:06 CEST 2008
On Mon, 21 Apr 2008, Hans-Joerg Bibiko wrote:
> On 21 Apr 2008, at 12:33, Prof Brian Ripley wrote:
>>> Is it possible to download a compiled snapshot of 2.7.0 for Windows XP?
>> Yes, http://cran.r-project.org/bin/windows/base/rtest.html
>> And it is due for release tomorrow.
> Many thanks! I can see the progress :)
> But please forgive my incompetence. I'm not so familiar with Windows.
> If I start e.g. RGUI by using: Rgui.exe LC_CTYPE=ja I can type Japanese,
> Russian, and German. strsplit works perfectly! ;)
> But if I type for instance a German umlaut 'ü' it comes out as 'u'. OK, it is
> due to the fact I didn't set up Rgui in UTF-8 mode.
Entering at the keyboard in more than one language is close to impossible
(not quite, as 'Japanese' covers a few but you need a Japanese keyboard to
do it). You can't change the language of Windows just by setting locales.
> But how can I do this? My data are written in many different languages, and I
> want to do some statistics.
You can read in files in known encodings, though.
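
As a minimal sketch of what reading a file in a known encoding can look like (the temporary file and its contents are made up for illustration, and this assumes the encoding really is Latin-1), one can read the bytes and convert them explicitly with iconv():

```r
# Hypothetical sample file: the bytes of "Grüße" in Latin-1 (ISO-8859-1).
tmp <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0x47, 0x72, 0xfc, 0xdf, 0x65, 0x0a)), tmp)

# Read the lines as they are, then convert from the known encoding to UTF-8.
raw_lines <- readLines(tmp)
txt <- iconv(raw_lines, from = "latin1", to = "UTF-8")

# The code points do not depend on the locale of the session.
code_points <- utf8ToInt(txt)

unlink(tmp)
```

An alternative is to declare the encoding on the connection, e.g. readLines(file(tmp, encoding = "latin1")), which re-encodes to the native encoding of the session; whether every character survives then depends on the locale.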
> R version 2.7.0 RC (2008-04-19 r45391)
> all to German_Germany.1252
> There are some minor issues.
> I set Rgui's font to "Arial Unicode". This works but I have some troubles to
> place my cursor, caused by the issue that Arial Unicode is not a monospaced font.
Right, and you are warned not to do that. You must use a fixed-width
font, and for CJK characters, one in the standard single/double spacing.
(See for example the comments in Rconsole and rw-FAQ 3.5. The GUI
preferences dialog only offers fixed-width fonts, so you have to work
quite hard to do anything else.)
> If I start up Rgui in German, I can see the localized menu items, but for
> each non-ASCII character I see cryptic things. It seems to me that the
> localized strings are written in UTF-8, and Rgui expects ANSI characters.
Argh, yes, that was an error by the translator in marking the file --
thanks, I just have time to fix it. (RGui does not expect ANSI, but all
of R does expect translations to be in the encoding they are declared to
be -- this was declared as ISO-8859-1.)
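
To illustrate why a mismatch between declared and actual encoding produces the cryptic characters described, here is a small sketch (not the actual translation file, just the single character 'ü' as an assumed example) of UTF-8 bytes being interpreted as ISO-8859-1:

```r
# The two bytes that encode 'ü' in UTF-8.
utf8_bytes <- as.raw(c(0xc3, 0xbc))

# Interpreted under the wrong declaration (Latin-1), each byte becomes
# its own character, so one letter turns into two.
misread <- iconv(rawToChar(utf8_bytes), from = "latin1", to = "UTF-8")
misread_points <- utf8ToInt(misread)

# Interpreted under the correct declaration, the bytes are one character.
correct <- rawToChar(utf8_bytes)
Encoding(correct) <- "UTF-8"
correct_points <- utf8ToInt(correct)
```

Here misread_points holds two code points (195 and 188, i.e. 'Ã' and '¼'), while correct_points holds the single code point 252 for 'ü'.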
> Nevertheless, thanks a lot!
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595