[Rd] Problem with UTF-8 text in the Rcmdr package

Prof Brian Ripley ripley at stats.ox.ac.uk
Sun Sep 7 13:22:55 CEST 2008


The issue appears to be the Rcmdr output window and menus.  They are done 
using Tcl/Tk, not by R.  So this might be a problem in Tcl/Tk or the fonts 
it uses, or it might be a problem with what Rcmdr passes to the tcltk 
package.

We need the means to reproduce this (as per the posting guide):

- what OSes are affected?  Does this occur in a UTF-8 locale on Linux, for 
example?

- in what locales?

- what versions of Tcl/Tk?  Note that the version shipped with Windows R 
changed between 2.5.1 and 2.7.x.

- Is this anything to do with translations?  I've not looked at how 
translations are done in Rcmdr, but if gettext() is used, the string 
passed to R for output is in the native encoding, so 'UTF-8 characters' is 
incorrect.  It is possible that it is an iconv problem if the translations 
are supplied in UTF-8 and not Latin-2.
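
As a rough sketch of how the details asked for above could be gathered in one go, something like the following might be pasted into a report. The helper name report_environment() is hypothetical (it is not part of R or Rcmdr), and the Tcl query assumes the tcltk package can be loaded on the machine in question; if it cannot, the field is simply left as NA.

```r
# Hypothetical helper: collect OS, R version, locale and Tcl version
# into one named list for inclusion in a bug report.
report_environment <- function() {
  tcl_version <- NA_character_
  # tcltk may be missing or fail to load, so guard the query.
  if (capabilities("tcltk")[[1]]) {
    tcl_version <- tryCatch(
      tcltk::tclvalue(tcltk::tcl("info", "patchlevel")),
      error = function(e) NA_character_
    )
  }
  list(
    os        = R.version$platform,
    r_version = R.version.string,
    locale    = Sys.getlocale("LC_CTYPE"),
    utf8      = isTRUE(l10n_info()[["UTF-8"]]),
    tcl       = tcl_version
  )
}

env <- report_environment()
str(env)
```

Running this once under each R version being compared (e.g. 2.5.1 and 2.7.x) would show whether the locale or the Tcl version differs between the working and failing setups.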

There are far too many layers involved here to guess at what is going on.
My guess is that it ought to be possible to give a simple example of a 
string which can be output to the Rcmdr console and will be rendered 
incorrectly (together with a screen shot of how it is rendered).

I think the characters referred to are the Unicode glyphs 's and z with 
caron', \u0161 and \u017E.  It seems that these will only be displayable 
in Rcmdr on Windows in a Latin-2 locale, which I do not have set up on 
Windows (but believe I could get installed).  However, examples using that 
(and the menus) seem to be correct in both sl_SI.iso88592 and sl_SI.utf8 
on Linux, which suggests that this is probably not an R issue but a Tcl/Tk 
one.
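
For anyone wanting to see what bytes are involved, here is a minimal sketch that prints the three lower-case letters in UTF-8, Latin-2 and the Windows Central European code page. The helper bytes_in() is hypothetical, and the encoding names "ISO-8859-2" and "CP1250" are assumptions about what the platform's iconv accepts (iconvlist() shows what is available). It only displays byte values and is not a diagnosis of the Rcmdr problem.

```r
# The three lower-case letters from the report: c, s and z with caron.
chars <- c(c_caron = "\u010d", s_caron = "\u0161", z_caron = "\u017e")

# Hypothetical helper: hex bytes of each string in a target encoding.
# Returns "" for a character the target encoding cannot represent.
bytes_in <- function(x, to) {
  raw_list <- iconv(enc2utf8(x), from = "UTF-8", to = to, toRaw = TRUE)
  out <- vapply(raw_list,
                function(r) paste(as.character(r), collapse = " "),
                character(1))
  setNames(out, names(x))
}

encodings <- data.frame(
  utf8   = bytes_in(chars, "UTF-8"),
  latin2 = bytes_in(chars, "ISO-8859-2"),
  cp1250 = bytes_in(chars, "CP1250"),
  stringsAsFactors = FALSE
)
print(encodings)
```

In this table c with caron has the same single byte (e8) in Latin-2 and CP1250, whereas s and z with caron have different bytes in the two encodings. That matches the pattern in the report (c unaffected, s and z affected), but whether it is the actual cause is only a guess until the problem can be reproduced.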

On Fri, 5 Sep 2008, John Fox wrote:

> Dear list members,
>
> I've attached some email correspondence with Jaro Lajovic (with his 
> permission), detailing a problem with the Slovenian translation file for 
> the Rcmdr package.

Unfortunately, it is not 'detailed', and we do need the details.

> In brief, while certain UTF-8 characters used in Slovenian used to 
> appear properly in older versions of R, some characters do not display 
> properly in the Rcmdr menus and output window under R 2.7.x. I've 
> confirmed the problem with the current version of the Rcmdr package 
> (1.4-0) and R 2.7.2 under Windows Vista.
>
> I've checked the R docs and NEWS file for changes to R, but wasn't able 
> to turn up anything that seemed relevant. Frankly, however, my 
> understanding of how various character sets are handled is only partial.
>
> Any help would be appreciated.
>
> John
>
> ------------------------------
> John Fox, Professor
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada
> web: socserv.mcmaster.ca/jfox
>
>
> -----Original Message-----
> From: Jaro.Lajovic [mailto:Jaro.Lajovic at mf.uni-lj.si]
> Sent: August-26-08 2:57 AM
> To: John Fox
> Subject: Re: Slovenian Rcmdr .po and .mo - and a problem
>
> Dear John,
>
>> That seems to imply that there's a change in R rather than in the Rcmdr
>> that produced this problem. Do you notice the problem with any other
>> packages that use translation or with R itself?
>
> As for other translated R packages, I am afraid I am not aware of any.
> However, a quick test using cat with special characters:
> cat("ČŠŽčšž\n")
> reveals that the string prints OK in the R (2.7.1) console. The command
> line also shows OK in the Rcmdr Script window, but does not display
> right in the Output window. Special chars also fail in the Messages window.
>
> Input (Script window) thus seems not to be affected, while the menu
> system and output do not work properly.
>
> Thank you very much,
> Jaro
>
>
>> On Mon, 25 Aug 2008 21:54:43 +0200
>>  "Jaro.Lajovic" <Jaro.Lajovic at mf.uni-lj.si> wrote:
>>> Dear John,
>>>
>>>> One question though: I assume from your message that the previous
>>>> version of the Rcmdr worked OK with R 2.7.1. Is that right?
>>> No, the version 1.3-5 (that I still have with R 2.5.1) does not work
>>> with R 2.7.1 either. So:
>>>
>>> Rcmdr 1.3-5 with R 2.5.1: works OK.
>>> Rcmdr 1.3-5 with R 2.7.1: does not work properly.
>>> Rcmdr 1.4-0 with R 2.7.1: does not work properly.
>>>
>>> Thank you in advance,
>>> Jaro
>>>
>>>
>>>
>>>> On Mon, 25 Aug 2008 18:52:32 +0200
>>>>  "Jaro.Lajovic" <Jaro.Lajovic at mf.uni-lj.si> wrote:
>>>>> Dear John,
>>>>>
>>>>> Please find attached zipped Slovenian versions of .po (plain text and
>>>>> UTF-8 coded text) and .mo files.
>>>>>
>>>>> However, there seems to be a problem I have not been able to resolve.
>>>>> While special characters display properly under R version 2.5.1 with
>>>>> Rcmdr 1.3-5, they fail to display (= are substituted by black blocks)
>>>>> under R version 2.7.1 with the new Rcmdr 1.4-0. By the way: the .mo
>>>>> file of the ver. 1.3-5 copied to 1.4-0 also failed to display
>>>>> properly.
>>>>>
>>>>> (An additional detail: three special characters that are used in the
>>>>> Slo version are c, s and z with hacek. c with hacek is not affected,
>>>>> it is just s and z with hacek that are not displayed OK.)
>>>>>
>>>>> Your advice will be much appreciated.
>>>>>
>>>>> With best regards,
>>>>> Jaro
>>
>> --------------------------------
>> John Fox, Professor
>> Department of Sociology
>> McMaster University
>> Hamilton, Ontario, Canada
>> http://socserv.mcmaster.ca/jfox/
>>
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

