[R] Sweave encoding problem
Duncan Murdoch
murdoch at stats.uwo.ca
Fri Jan 23 13:07:03 CET 2009
Gerrit Voigt wrote:
> The two documents were actually different, which I didn't notice
> yesterday. One had different encoding. Thanks for your help Duncan.
> Unfortunately, the other problem still exists. My R or Sweave seems not to
> be able to work with utf-8 encoding. Everything works fine with
> latin-1, though. I could check my assumption if there was a possibility
> to switch R from latin-1 to utf-8. Does anybody have an idea how that
> might work?
>
Connections and functions that read from them generally have an
"encoding" argument; I think you need to have that set to "UTF-8" or
"latin1" as appropriate. However, Sweave() doesn't offer an option to
pass that argument down to the readLines() call that actually reads the
file. I believe options(encoding="UTF-8") or options(encoding="latin1")
will set the default if you run it before calling Sweave().
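
A minimal sketch of that idea, assuming (as described above) that the file is read through readLines() with the default connection settings; newer R versions may handle Sweave encodings differently. The temporary file and the variable names are made up for illustration. It writes a line as UTF-8 bytes, sets the default connection encoding, reads the line back, and restores the previous setting:

```r
# Hypothetical demo file holding one line of known UTF-8 bytes.
path <- tempfile(fileext = ".Snw")
utf8_line <- enc2utf8("Oberfl\u00e4chenfehler = c(4, 11, 6, 2, 7, 9)")
writeBin(charToRaw(utf8_line), path)

# Set the default encoding for new connections, remembering the old value.
old <- options(encoding = "UTF-8")

# readLines() on a file name opens a file() connection, which picks up
# getOption("encoding") and converts the input to the native encoding.
lines <- readLines(path, warn = FALSE)

# A call such as Sweave(path) would go here, while the option is in effect.

# Restore the previous default so later reads are not affected.
options(old)
```

Restoring the option right away keeps the change local to the one call that needs it.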
You will probably find it frustrating to keep switching that option; I'd
recommend storing files in the native encoding for your system, which R
will default to using. (This doesn't work if you share the same file on
multiple systems, of course.)
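
One way to follow that advice is to convert the file once instead of switching the option each time. The sketch below is only an illustration: it assumes the source bytes are UTF-8 and that Latin-1 is the target, as on the Windows setup in this thread, and the file names are temporary stand-ins for a real .Snw file. l10n_info() reports what R considers the native encoding.

```r
# Report how R sees the current locale (the "native" encoding).
info <- l10n_info()
native_is_utf8 <- isTRUE(info[["UTF-8"]])

# Hypothetical demo files; in practice src would be the existing .Snw file.
src <- tempfile(fileext = ".Snw")
dst <- tempfile(fileext = ".Snw")
writeBin(charToRaw(enc2utf8("Oberfl\u00e4chenfehler")), src)  # UTF-8 bytes

# Read the raw bytes, re-encode UTF-8 -> Latin-1, and write raw bytes back.
# Working on raw bytes avoids any dependence on the current locale.
bytes <- readBin(src, what = "raw", n = file.info(src)$size)
latin1_text <- iconv(rawToChar(bytes), from = "UTF-8", to = "latin1")
writeBin(charToRaw(latin1_text), dst)
```

If native_is_utf8 is TRUE, the file is already in the native encoding and no conversion is needed.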
Duncan Murdoch
> Gerrit Voigt
>
> Duncan Murdoch schrieb:
>
>> Gerrit Voigt wrote:
>>
>>> Hi Roland,
>>> thanks for your answer. I actually tried out a different, smaller
>>> LaTeX header and the Sweave process suddenly worked. So I copied
>>> parts of the old header into the new one, to check what part is
>>> causing the trouble. In the end I had two documents with identical
>>> content. The new document worked fine with Sweave; the other still
>>> gave the error message. If anybody has experienced that problem
>>> before, and knows an answer, please let me know.
>>>
>>>
>> This sounds like you have discovered homeopathic properties in
>> Sweave! It will be serious if input files remember errors even after
>> they have been removed.
>>
>> But I think it's more likely that the files just look the same in your
>> editor, but are actually different in some way you don't see.
>> Candidates:
>> - the encoding: maybe your editor is recognizing the encoding, and
>> automatically displaying similar content from different input.
>> - non-printing characters: maybe your editor is skipping some.
>>
>> I'd suggest doing a binary compare on the two files to see what the
>> differences are. I think you are on Windows (but I may be misreading
>> the quotes below); I recommend Beyond Compare (a shareware compare
>> utility). It has a hex viewer plug-in that could show you a detailed
>> comparison. I imagine diff on Unix has something similar.
>>
>> Duncan Murdoch
>>
>>> Unfortunately I also still have an encoding problem with the new
>>> document that ran through Sweave. If I use "ISO-8859-15" fontencoding
>>> in my editor and "latin1" for input encoding in my LaTeX document
>>> everything works fine. If I keep both in "utf8", as I would like,
>>> German umlauts (Umlaute) aren't displayed correctly.
>>>
>>> Rau, Roland schrieb:
>>>
>>>
>>>> Hi Gerrit,
>>>>
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: r-help-bounces at r-project.org
>>>>> [mailto:r-help-bounces at r-project.org] On Behalf Of Gerrit Voigt
>>>>> Sent: Monday, January 19, 2009 4:48 PM
>>>>> To: r-help at r-project.org
>>>>> Subject: [R] Sweave encoding problem
>>>>>
>>>>> Hello,
>>>>> Sweave seems to have trouble processing German letters in R.
>>>>> For example, my noweb R-input looks like this.
>>>>> <<>>=
>>>>> Oberflächenfehler = c(4, 11, 6, 2, 7, 9)
>>>>> @
>>>>> If I send it through Sweave, I get the following error message.
>>>>>
>>>>> error: chunk 1
>>>>> Error in parse(text = chunk) : unexpected input in "Oberflä"
>>>>> extra: Warning message:
>>>>> In readLines(f[1]) :
>>>>> underfull last line in "C:\...."
>>>>>
>>>>> (My R is in German, so I needed to translate the error message
>>>>> myself.)
>>>>>
>>>>> I got the impression, that this is an encoding issue of Sweave,
>>>>> since the input typed into R directly works just fine. The
>>>>> encoding I use in my noweb document is utf8.
>>>>>
>>>>>
>>>> I don't think it has anything to do with German letters.
>>>> I saved the following text in a file 'sweavy.Snw':
>>>> \documentclass{article}
>>>>
>>>> \begin{document}
>>>> Hello World!
>>>>
>>>> <<>>=
>>>> 1+1
>>>> @
>>>> <<>>=
>>>> Oberflächenfehler = c(4, 11, 6, 2, 7, 9)
>>>> @
>>>> \end{document}
>>>>
>>>> This is what happened in R:
>>>>
>>>>
>>>> > library(utils)
>>>> > Sweave("sweavy.Snw")
>>>> Writing to file sweavy.tex
>>>> Processing code chunks ...
>>>> 1 : echo term verbatim
>>>> 2 : echo term verbatim
>>>>
>>>> You can now run LaTeX on 'sweavy.tex'
>>>>
>>>>
>>>> > sessionInfo()
>>>> R version 2.7.0 (2008-04-22) i386-pc-mingw32
>>>> locale:
>>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
>>>> States.1252;LC_MONETARY=English_United
>>>> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>>>
>>>> attached base packages:
>>>> [1] stats graphics grDevices utils datasets methods base
>>>> And also the dvi looked fine after processing "latex sweavy.tex"
>>>> To make sure, I did in my editor (GNU Emacs 22.1.50.1)
>>>> C-x RET f utf-8
>>>> to change set-buffer-file-coding-system to utf-8.
>>>> Still works fine.
>>>>
>>>> Maybe this helps you further to track down the reason for the
>>>> problem?!?
>>>>
>>>> Best,
>>>> Roland
>>>>
>>>
>>>
>>> ------------------------------------------------------------------------
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>