[R] Multiple language output - Correct in RGui, wrong in .txt after sink()

mark.redshaw at evonik.com
Wed May 19 18:35:06 CEST 2010


I have the following problem with outputting multilingual data to a file. 
In the RGui console I get the result I expect (except for Korean), but when I 
use sink() to write to a text file I lose the characters in the foreign 
languages.
I post a small example below. Since I am not sure how well my email system 
and the list cope with all the different characters, I have additionally 
created a pdf version of this example.
The first part of the example behaves as I expect for all languages except 
Korean. I believe the Korean output may be a font problem; 
it would be great if someone could confirm this.
In the second part, with output to the txt file, I get <U+FF71>-style 
Unicode escapes instead of the expected characters. My main question is: how 
can I write the characters to the file as they appear in the console?
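
To help separate a font problem from an encoding problem, a quick check along 
these lines may be useful. It is only a sketch: the sample string is the first 
Korean entry written with \u escapes (so it does not depend on how this mail 
is encoded), and the name `kr` is purely illustrative.

```r
# First Korean entry, spelled with \u escapes so the source encoding does not matter
kr <- "\uc54c\ud314\ud30c \uac74\ucd08"

# How R has marked the string internally
print(Encoding(kr))

# Number of characters versus number of bytes in the stored string
print(nchar(kr, type = "chars"))
print(nchar(kr, type = "bytes"))

# The locale R is running in; as far as I understand, cat() converts
# strings to this native encoding before writing them
print(Sys.getlocale("LC_CTYPE"))
print(l10n_info())
```

If the string is reported as intact here but still displays wrongly in RGui, 
the console font seems the more likely culprit than the data itself.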

> RM_EN <- c("Alfalfa hay","Alfalfa meal","Alfalfa silage")
> RM_DE <- c("Luzerneheu","Lurzernegrünmehl","Luzernesilage")
> RM_RU <- c("Люцерновое сено","Люцерновая травяная мука","Люцерновый сенаж")
> RM_CN <- c("苜蓿干草","苜蓿草粉","苜蓿青贮")
> RM_JP <- c("アルファルファ乾草","アルファルファ ミール","アルファルファ サイレージ")
> RM_KR <- c("알팔파 건초","알팔파 박","알팔파 사일리지")
> 
> RMLANG <- data.frame(RM_EN,RM_DE,RM_RU,RM_CN,RM_JP,RM_KR)
> nrm <- NROW(RMLANG)
> 
> for(i in 1:nrm)
+ {
+ cat(format("English",    width = 12, justify = c("left")), as.character(RMLANG$RM_EN[i]),"\n",sep="")
+ cat(format("Deutsch",    width = 12, justify = c("left")), as.character(RMLANG$RM_DE[i]),"\n",sep="")
+ cat(format("Russian",    width = 12, justify = c("left")), as.character(RMLANG$RM_RU[i]),"\n",sep="")
+ cat(format("Japanese",   width = 12, justify = c("left")), as.character(RMLANG$RM_JP[i]),"\n",sep="")
+ cat(format("Chinese",    width = 12, justify = c("left")), as.character(RMLANG$RM_CN[i]),"\n",sep="")
+ cat(format("Korean",     width = 12, justify = c("left")), as.character(RMLANG$RM_KR[i]),"\n","\n","\n",sep="")
+ }
English     Alfalfa hay
Deutsch     Luzerneheu
Russian     Люцерновое сено
Japanese    アルファルファ乾草
Chinese     苜蓿干草
Korean      알팔파 건초

English     Alfalfa meal
Deutsch     Lurzernegrünmehl
Russian     Люцерновая травяная мука
Japanese    アルファルファ ミール
Chinese     苜蓿草粉
Korean      알팔파 박

English     Alfalfa silage
Deutsch     Luzernesilage
Russian     Люцерновый сенаж
Japanese    アルファルファ サイレージ
Chinese     苜蓿青贮
Korean      알팔파 사일리지

> # open the file once: calling sink("output.txt") inside the loop would
> # overwrite the file on every pass, because append defaults to FALSE
> sink("output.txt")
> for(i in 1:nrm)
+ {
+ cat(format("English",    width = 12, justify = c("left")), as.character(RMLANG$RM_EN[i]),"\n",sep="")
+ cat(format("Deutsch",    width = 12, justify = c("left")), as.character(RMLANG$RM_DE[i]),"\n",sep="")
+ cat(format("Japanese",   width = 12, justify = c("left")), as.character(RMLANG$RM_JP[i]),"\n",sep="")
+ cat(format("Chinese",    width = 12, justify = c("left")), as.character(RMLANG$RM_CN[i]),"\n",sep="")
+ cat(format("Korean",     width = 12, justify = c("left")), as.character(RMLANG$RM_KR[i]),"\n","\n","\n",sep="")
+ }
> sink()
> 
output.txt contains:
""
English     Alfalfa hay
Deutsch     Luzerneheu
Japanese    <U+FF71><U+FF99><U+FF8C><U+FF67><U+FF99><U+FF8C><U+FF67><U+4E7
Chinese     <U+82DC><U+84FF><U+5E72><U+8349>
Korean      <U+C54C><U+D314><U+D30C> <U+AC74><U+CD08>

English     Alfalfa meal
Deutsch     Lurzernegrünmehl
Japanese    <U+FF71><U+FF99><U+FF8C><U+FF67><U+FF99><U+FF8C><U+FF67> <U+FF
Chinese     <U+82DC><U+84FF><U+8349><U+7C89>
Korean      <U+C54C><U+D314><U+D30C> <U+BC15>

English     Alfalfa silage
Deutsch     Luzernesilage
Japanese    <U+FF71><U+FF99><U+FF8C><U+FF67><U+FF99><U+FF8C><U+FF67> <U+FF
Chinese     <U+82DC><U+84FF><U+9752><U+8D2E>
Korean      <U+C54C><U+D314><U+D30C> <U+C0AC><U+C77C><U+B9AC><U+C9C0>
""

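One workaround that might be worth trying is to bypass sink() and write the 
lines through a binary-mode file connection as UTF-8 bytes. This is a minimal 
sketch, not a confirmed fix: `write_utf8` is a hypothetical helper (not part of 
base R), the file name is an assumption, and the strings are the first Chinese 
and Korean entries spelled with the code points shown in the output above.

```r
# Hypothetical helper (not from base R): write character data as raw UTF-8 bytes
write_utf8 <- function(lines, path) {
  con <- file(path, open = "wb")   # binary mode: no newline translation
  on.exit(close(con))
  # enc2utf8() converts to UTF-8; useBytes = TRUE asks writeLines() to write
  # the bytes as they are instead of re-encoding to the native locale
  writeLines(enc2utf8(lines), con, useBytes = TRUE)
}

# First entry for two of the languages, written with \u escapes
RM_CN <- "\u82dc\u84ff\u5e72\u8349"
RM_KR <- "\uc54c\ud314\ud30c \uac74\ucd08"

out <- c(
  paste0(format("Chinese", width = 12), RM_CN),
  paste0(format("Korean",  width = 12), RM_KR)
)

# Assumed output location, used only for this illustration
out_file <- file.path(tempdir(), "output_utf8.txt")
write_utf8(out, out_file)

# Read the file back, declaring its encoding, to check the round trip
back <- readLines(out_file, encoding = "UTF-8")
print(identical(back, out))
```

The resulting file would need to be opened in an editor that reads UTF-8; 
whether this also cures the sink() behaviour itself I cannot say.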


many thanks
Mark Redshaw
Animal Nutrition Services 
Evonik Degussa GmbH, HN-M-AN, Rodenbacher Chaussee 4, 63457 Hanau, Germany 

Tel: +49 61 81 59 6788 
www.aminoacidsandmore.com 

