[R] Writing Unicode Text into Text File from R (in Windows)
Duncan Murdoch
murdoch.duncan at gmail.com
Tue Feb 4 13:48:18 CET 2014
On 14-02-04 5:49 AM, Majid Einian wrote:
> Dear R Helpers,
>
> See the Code:
>
> a <- intToUtf8(1777)
> show(a)
> zz <- file(description="test.txt",open="w",encoding="UTF-8")
> cat(a, file = zz)
> close(zz)
>
> In a Unicode-aware environment (such as the RGui console or the RStudio
> console) you will see this output:
>
> [1] "۱"
>
>
> But the character is not written correctly to the file test.txt (which is
> encoded in UTF-8 without a BOM):
>
> <U+06F1>
>
> The problem seems to be this: R converts the text to the system locale (for
> me this is Arabic Windows, code page 1256, which has no code for U+06F1),
> then converts it back to UTF-8 and writes it to the file.
> What am I missing here?
> How can I write a Unicode string into a text file correctly?
There are a lot of places in R where it converts strings to the local
encoding, perhaps too many. On the other hand, maybe Windows should be
offering UTF-8 locales by now.
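
To check that the string itself is fine before it reaches any connection, a minimal sketch along these lines may help. It only inspects the string in memory, so it says nothing about what a particular connection will do with it in a CP1256 locale.

```r
# U+06F1 (EXTENDED ARABIC-INDIC DIGIT ONE), as in the original post
a <- intToUtf8(1777)

# Declared encoding of the string, as R sees it
enc <- Encoding(a)
print(enc)

# The bytes R holds for the string; UTF-8 for U+06F1 is the pair db b1
bytes_in_memory <- charToRaw(a)
print(bytes_in_memory)

# Round trip back to the code point
code_point <- utf8ToInt(a)
print(code_point)
```

If the bytes are db b1 here but the file contains the literal text <U+06F1>, the damage happened on output, not when the string was created.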
I haven't tested in your locale, but I believe writeLines() to a
connection declared to be in a UTF-8 encoding will maintain the
encoding. You can declare a file to be in encoding "UTF-8-BOM" if you
want to ignore a BOM on input; I forget whether it will write one on
output. If it doesn't, you can always write one explicitly.
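
If writeLines() on an encoded connection still goes through the locale, one way to sidestep re-encoding altogether is to write the UTF-8 bytes through a binary connection. What follows is a sketch under assumptions, not tested in a CP1256 locale: the temporary path and the write_bom switch are made up for illustration, and the BOM is off by default.

```r
# U+06F1, as in the original post
a <- intToUtf8(1777)

# Hypothetical output path, chosen only for this illustration
path <- file.path(tempdir(), "test.txt")

# Hypothetical switch: set to TRUE to write a UTF-8 BOM explicitly
write_bom <- FALSE

# Binary mode: bytes are written as given, with no conversion on output
con <- file(path, open = "wb")
if (write_bom) {
  writeBin(as.raw(c(0xef, 0xbb, 0xbf)), con)
}
# enc2utf8() makes sure the string is UTF-8; charToRaw() exposes its bytes
writeBin(charToRaw(enc2utf8(a)), con)
close(con)

# Read the raw bytes back to see what actually landed in the file
bytes <- readBin(path, what = "raw", n = file.info(path)$size)
print(bytes)
```

With write_bom left FALSE the file should contain just the two bytes db b1. Note that this writes no line terminator; append charToRaw("\n") if one is wanted.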
I was hoping to make some progress on this before R 3.1.0 so that more
cases of writing strings to UTF-8 files would work, but time is running out.
Duncan Murdoch
>
>
> Majid Einian,
> Economics Researcher, Monetary and Banking Research Institute, Central Bank
> of Islamic Republic of Iran, Tehran, IRAN
> and
> PhD Candidate in "Economics", Graduate School of Management and
> Economics, Sharif University of Technology, Tehran, IRAN