[R] Input encoding problem when using sweave with xetex
Erich Studerus
erich.studerus at bli.uzh.ch
Wed May 12 17:36:12 CEST 2010
Putting \usepackage[cp1252]{inputenc} into my preamble is not an option,
because XeTeX, unlike LaTeX, needs UTF-8 as input encoding. My goal is also
to have a LyX document that can be compiled on both Mac and Windows.
I usually compile my LyX-Sweave documents with one click of a button from
within LyX. R code chunks are therefore executed by calling R from the
command line. If anybody knows how to run R with options(encoding="UTF-8")
from the command line under Windows, that would be helpful.
The command that calls R during compilation is contained in this file:
http://cran.r-project.org/contrib/extra/lyx/preferences
Regards,
Erich
-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com]
Sent: Wednesday, 12 May 2010 16:56
To: Erich Studerus
Cc: r-help at r-project.org
Subject: Re: [R] Input encoding problem when using sweave with xetex
On 12/05/2010 9:48 AM, Erich Studerus wrote:
> Thanks. Since the encoding of x is unknown (Encoding(x) gives "unknown"),
> I tried
>
> iconv(x, "", "UTF-8")
>
> Unfortunately, accented letters are still not printed in the final PDF
> output.
>
I think I gave you incomplete advice.
The line above will convert the native encoding to UTF-8. That's
probably fine, but it's not actually helpful.
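
To make the difference between re-labelling and re-encoding concrete, here
is a minimal sketch. It assumes a made-up sample string (a single "ä") and
uses "latin1" as an explicit source encoding so that the result does not
depend on the locale; for this character Latin-1 and CP1252 use the same
byte. The variable names are only for illustration.

```r
# "ä" in CP1252/Latin-1 is the single byte 0xE4 (hypothetical sample value).
x <- rawToChar(as.raw(0xe4))
Encoding(x) <- "latin1"

# Re-declaring the encoding only changes the label, not the bytes.
y <- x
Encoding(y) <- "UTF-8"
bytes_relabelled <- charToRaw(y)

# iconv() re-encodes: the same character becomes two bytes in UTF-8.
z <- iconv(x, from = "latin1", to = "UTF-8")
bytes_converted <- charToRaw(z)

print(bytes_relabelled)  # e4
print(bytes_converted)   # c3 a4
```

With from = "" instead of "latin1", iconv() would take the current locale's
encoding as the source, which is what the line quoted above does.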
The problem is that when R outputs a vector, it will convert it back to
the native encoding, unless you take action to stop that. If you don't
mind changing your document for Windows, you can put
\usepackage[cp1252]{inputenc}
into the preamble, and use the Windows native CP1252 encoding
throughout. If you want something that will work in UTF-8 on Windows,
you need to say
options(encoding="UTF-8")
*before* running Sweave. (If you're running Sweave from the command
line using "R CMD Sweave" then I don't know if you can specify the
output encoding; it won't help to do it in the document code chunks).
You also need to put the line
\usepackage[utf8]{inputenc}
into the document preamble, but it sounds as though LyX has already done
that for you.
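
As a rough sketch of "set the option before running Sweave", something like
the following might work. The helper name sweave_utf8 and the file demo.Rnw
are hypothetical, and the demo file is kept pure ASCII (the umlaut is written
as an escape) so that the sketch runs anywhere. Whether this gives the
desired output on a particular Windows setup is an assumption that would
need to be checked there.

```r
# Hypothetical helper: set the connection encoding *before* Sweave runs,
# then restore the previous option and working directory afterwards.
sweave_utf8 <- function(rnw, outdir = dirname(rnw)) {
  rnw <- normalizePath(rnw)
  old_opts <- options(encoding = "UTF-8")
  old_wd <- setwd(outdir)
  on.exit({
    setwd(old_wd)
    options(old_opts)
  }, add = TRUE)
  Sweave(rnw)
}

# Small ASCII-only demo document; the file name is a placeholder.
rnw <- file.path(tempdir(), "demo.Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<echo=FALSE>>=",
  "x <- \"M\\u00fcnchen\"",
  "x",
  "@",
  "\\end{document}"
), rnw)

sweave_utf8(rnw)  # writes demo.tex next to demo.Rnw
```

Saved as a script, code along these lines could also be run
non-interactively with Rscript instead of "R CMD Sweave", so the option is
in place before Sweave starts.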
Duncan Murdoch
> Regards,
> Erich
>
>
>
> -----Original Message-----
> From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com]
> Sent: Wednesday, 12 May 2010 15:27
> To: Erich Studerus
> Cc: r-help at r-project.org
> Subject: Re: [R] Input encoding problem when using sweave with xetex
>
> On 12/05/2010 8:37 AM, Erich Studerus wrote:
>
>> Hello
>>
>> Because I want to use different TrueType fonts with LaTeX, I'm using the
>> XeTeX typesetting engine for my Sweave documents. I'm using LyX with Sweave
>> on a Windows 7 PC and have set up LyX to work with XeTeX according to the
>> following instructions:
>>
>> http://wiki.lyx.org/LyX/XeTeX
>>
>> Because the input file for XeTeX is assumed to be in UTF-8 encoding, I set
>> the encoding under LyX - Tools - Language Settings - Language to "Unicode
>> (XeTeX) (utf8)". Accented letters that I write into the LyX document are
>> correctly typeset in the final PDF document. However, character strings with
>> accented letters that are read from Excel files or other sources from within
>> R during the LyX-Sweave document compilation are not. For instance, the
>> German umlauts in the following example are not correctly typeset when
>> "Unicode (XeTeX) (utf8)" is used as input encoding.
>>
>> <<echo=F>>=
>> require(gdata)
>> x <- read.xls("http://www.schwerhoerigkeit.pop.ch/hoergeraete_test.xls",
>> stringsAsFactors = F)[2,2]
>> x
>> @
>>
>> I do not have this problem on a Mac. I guess this is because R
>> under Windows does not use UTF-8 encoding. I tried to change the encoding
>> within R by doing the following:
>>
>> <<echo=F>>=
>> Encoding(x) <- 'UTF-8'
>> x
>> @
>>
>> Unfortunately, this does not work. Does anybody have a solution for this
>> problem?
>>
>
> You need to use iconv() to change an encoding. What you did just
> changes the declared encoding, but doesn't actually change any bits. So
> you'd probably get what you want with
>
> x <- iconv(x, "", "UTF-8")
> x
>
> (though you may need to declare the input encoding; it is likely CP1252
> on Windows).
>
> Duncan Murdoch
>
>> Regards,
>> Erich
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.