[Rd] Bug in 2.4.0 Windows menu setup (PR#9277)

Sat Oct 7 07:52:52 CEST 2006

 iconv  -f  utf-8 -t cp936 RGui-zh_CN.po > RGui-zh_CN.po.cp936
 iconv: illegal input sequence at position 19303

 iconv -c -f  utf-8 -t cp936 RGui-zh_CN.po > RGui-zh_CN.po.cp936
       ^^
 iconv -f cp936 -t utf-8 RGui-zh_CN.po.cp936 > RGui-zh_CN.po.cp936utf8
 diff -uN  RGui-zh_CN.po   RGui-zh_CN.po.cp936utf8
@@ -852,7 +852,7 @@

 #: rui.c:1283 rui.c:1404
 msgid "menu + item is limited to 1000 bytes"
-msgstr "xxx"
+msgstr "xxx"

 grep -C1 "menu + item is limited to 1000 bytes" RGui-zh_CN.po

A translator should be asked to check the text of the entry where this
difference appears.
BTW, there is no such problem when converting to GB18030.
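
To narrow down which character is involved, one option is to test each character of the UTF-8 text against the target encoding. The following is only a sketch in Python, not what was run above: `unencodable_chars` is a hypothetical helper, the sample string is made up (U+3400 is just a character known to lie outside GBK), and Python's `cp936` codec is an alias for GBK, so its table may not match iconv's CP936 exactly. It reports character offsets, not the byte position that iconv prints.

```python
def unencodable_chars(text, encoding="cp936"):
    """Return (offset, char) pairs for characters `encoding` cannot represent.

    Offsets are character offsets into `text`, not byte offsets.
    """
    problems = []
    for offset, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            problems.append((offset, ch))
    return problems


# Made-up sample: three characters that GBK covers plus U+3400,
# which GBK lacks but GB18030 can encode.
sample = "\u83dc\u5355\u3400\u9879"
print(unencodable_chars(sample, "cp936"))
print(unencodable_chars(sample, "gb18030"))
```

For the real file, the text would come from something like `open("RGui-zh_CN.po", encoding="utf-8").read()`, and the reported characters are the ones a translator would need to look at.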

2006/10/7, Duncan Murdoch <murdoch at stats.uwo.ca>:
> On 10/6/2006 1:35 PM, Hin-Tak Leung wrote:
> > Duncan Murdoch wrote:
> >> On 2006-10-5 8:06, Ei-ji Nakama wrote:
> >>> I do not understand Chinese, but recognize kanji.
> >>> RGui-zh_CN.po is written in utf-8, but its header says charset=CP936.
> >>>
> >>>   perl -p -i -e 's#charset=CP936#charset=utf-8#' RGui-zh_CN.po
> >>>   msgfmt -o RGui.mo RGui-zh_CN.po
> >>
> >> Thanks!!  That does fix the error, at least on my system.  I'll commit
> >> the change to R-devel and R-patched.
> >
> > Hmm, I do understand Chinese, and I can confirm that the content
> > of RGui-zh_CN.po in R 2.4 is in utf-8 rather than CP936.
> >
> > I can also confirm that CP950 (Big5) for RGui-zh_TW.po is correct, and
> > CP932 (Shift-JIS) for RGui-ja.po is also correct (so you'll need to
> > find a Korean speaker to verify CP949 for RGui-ko.po).
> >
> > However, the fix is slightly "asymmetric". Out of ru, zh_CN, zh_TW,
> > ja, ko, only ru in R-2.4.0/po/*.po is in a localised encoding
> > (the other 4 are in UTF-8), whereas the RGui-*.po files, after the
> > fix, are all in localised encodings except RGui-zh_CN.po.
> >
> > I would propose correcting the encoding of the *content*, rather
> > than the charset tag, so that the RGui-* files all use localised
> > encodings (CP932, CP936, CP949, CP950). That should be better for
> > older Windows versions...
>
> I did try that, but iconv didn't want to convert the file from UTF-8 to
> CP936.  I've no idea why not.
>
> In any case, those files only need to be readable by the translation
> teams, not by end-users, so I don't think the asymmetry matters:  if a
> translator finds it easy to work in UTF-8 that's fine for R, as long as
> it is correctly recorded.
>
> Duncan Murdoch
>
>
>
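
Since the point above is that the charset must be correctly recorded, a rough check for this kind of mislabelling can be sketched in Python. This is a heuristic under stated assumptions, not part of R's or gettext's tooling: `declared_charset` and `looks_mislabelled` are hypothetical helpers, and the test relies on the observation that text in a legacy multibyte encoding such as CP936 very rarely happens to be valid UTF-8.

```python
import re


def declared_charset(po_bytes):
    """Return the charset named in the PO header, or None if none is found."""
    match = re.search(rb"charset=([A-Za-z0-9_.:-]+)", po_bytes)
    return match.group(1).decode("ascii") if match else None


def looks_mislabelled(po_bytes):
    """True if the header names a non-UTF-8 charset but the bytes are
    valid UTF-8 containing non-ASCII characters (heuristic only)."""
    charset = declared_charset(po_bytes)
    if charset is None or charset.lower() in ("utf-8", "utf8"):
        return False
    if po_bytes.isascii():
        return False  # pure ASCII is valid under either label
    try:
        po_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return False  # not UTF-8, so the legacy label is plausible
    return True


# Made-up miniature PO file: header claims CP936, body is UTF-8.
header = b'msgid ""\nmsgstr ""\n"Content-Type: text/plain; charset=CP936\\n"\n'
entry = 'msgid "File"\nmsgstr "\u6587\u4ef6"\n'
print(looks_mislabelled(header + entry.encode("utf-8")))
print(looks_mislabelled(header + entry.encode("cp936")))
```

Run over the bytes of each RGui-*.po file, a True result would point at a file in the state RGui-zh_CN.po was in before the charset tag was changed.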


-- 
EI-JI Nakama  <nakama at ki.rim.or.jp>
"中間栄治"  <nakama at ki.rim.or.jp>