[R] Sweave encoding problem

Gerrit Voigt gerrit.voigt at campus.tu-berlin.de
Fri Jan 23 10:34:10 CET 2009

The two documents were actually different, which I didn't notice
yesterday: one had a different encoding. Thanks for your help, Duncan.
Unfortunately, the other problem still exists. My R or Sweave does not
seem to be able to work with utf-8 encoding. Everything works fine with
latin-1, though. I could check my assumption if there were a way
to switch R from latin-1 to utf-8. Does anybody have an idea how that
might work?

Gerrit Voigt
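
One way to explore the question above is to ask R which encoding the session assumes, and to re-encode the input file to match it before running Sweave. The following is only a sketch: `reencode_file` is a hypothetical helper, not part of Sweave, and the temporary files stand in for a real .Snw document.

```r
# Show which character encoding the running R session assumes.
info <- l10n_info()
cat("LC_CTYPE locale:", Sys.getlocale("LC_CTYPE"), "\n")
cat("UTF-8 session:  ", isTRUE(info[["UTF-8"]]), "\n")
cat("Latin-1 session:", isTRUE(info[["Latin-1"]]), "\n")

# Hypothetical helper (not part of Sweave): rewrite a text file from one
# encoding to another, leaving all other bytes untouched.
reencode_file <- function(input, output, from = "UTF-8", to = "latin1") {
  bytes <- readBin(input, what = "raw", n = file.info(input)$size)
  converted <- iconv(rawToChar(bytes), from = from, to = to)
  if (is.na(converted)) {
    stop("could not convert ", input, " from ", from, " to ", to)
  }
  writeBin(charToRaw(converted), output)
  invisible(output)
}

# Demo on a temporary file holding "Oberflächenfehler" encoded as UTF-8.
utf8_file <- tempfile(fileext = ".Snw")
latin1_file <- tempfile(fileext = ".Snw")
writeBin(c(charToRaw("Oberfl"), as.raw(c(0xc3, 0xa4)),
           charToRaw("chenfehler\n")), utf8_file)
reencode_file(utf8_file, latin1_file)
latin1_bytes <- readBin(latin1_file, what = "raw",
                        n = file.info(latin1_file)$size)
```

Newer R releases also document an `encoding` argument to `Sweave()`; whether it is available depends on the installed version, so `?Sweave` is the place to check.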

Duncan Murdoch wrote:
> Gerrit Voigt wrote:
>> Hi Roland,
>> thanks for your answer. I actually tried out a different, smaller
>> LaTeX header and the Sweave process suddenly worked. So I copied
>> parts of the old header into the new one to check which part is
>> causing the trouble. In the end I had two documents with identical
>> content. The new document worked fine with Sweave; the other still
>> gave the error message. If anybody has experienced that problem
>> before and knows an answer, please let me know.
> This sounds like you have discovered homeopathic properties in 
> Sweave!  It will be serious if input files remember errors even after 
> they have been removed.
> But I think it's more likely that the files just look the same in your 
> editor, but are actually different in some way you don't see.  
> Candidates:
> - the encoding:  maybe your editor is recognizing the encoding, and 
> automatically displaying similar content from different input.
> - non-printing characters:  maybe your editor is skipping some.
> I'd suggest doing a binary compare on the two files to see what the 
> differences are.  I think you are on Windows (but I may be misreading 
> the quotes below); I recommend Beyond Compare (a shareware compare 
> utility).  It has a hex viewer plug-in that could show you a detailed 
> comparison.  I imagine diff on Unix has something similar.
> Duncan Murdoch
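
The binary comparison suggested above can also be done from within R. This is a minimal sketch, not a replacement for a dedicated compare tool: `first_difference` is a hypothetical helper name, and the two temporary files are invented stand-ins for the two documents.

```r
# Hypothetical helper: return the 1-based offset of the first byte at which
# two files differ, or NA if their contents are identical.
first_difference <- function(path_a, path_b) {
  a <- readBin(path_a, what = "raw", n = file.info(path_a)$size)
  b <- readBin(path_b, what = "raw", n = file.info(path_b)$size)
  shared <- min(length(a), length(b))
  mismatch <- which(a[seq_len(shared)] != b[seq_len(shared)])
  if (length(mismatch) > 0) return(mismatch[1])
  # Same common prefix, but one file is longer than the other.
  if (length(a) != length(b)) return(shared + 1L)
  NA_integer_
}

# Demo: the same word saved once as UTF-8 and once as latin1.
file_utf8 <- tempfile()
file_latin1 <- tempfile()
writeBin(c(charToRaw("Oberfl"), as.raw(c(0xc3, 0xa4)),
           charToRaw("chenfehler\n")), file_utf8)
writeBin(c(charToRaw("Oberfl"), as.raw(0xe4),
           charToRaw("chenfehler\n")), file_latin1)
offset <- first_difference(file_utf8, file_latin1)
same <- first_difference(file_utf8, file_utf8)
```

Here `offset` points at the byte where the umlaut starts, which is the kind of difference an editor that auto-detects encodings would hide.
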
>> Unfortunately I also still have an encoding problem with the new
>> document, which ran through Sweave. If I use "ISO-8859-15" fontencoding
>> in my editor and "latin1" for input encoding in my LaTeX document,
>> everything works fine. If I keep both in "utf8", as I would like to,
>> German umlauts aren't displayed correctly.
>> Rau, Roland wrote:
>>> Hi Gerrit,
>>>> -----Original Message-----
>>>> From: r-help-bounces at r-project.org 
>>>> [mailto:r-help-bounces at r-project.org] On Behalf Of Gerrit Voigt
>>>> Sent: Monday, January 19, 2009 4:48 PM
>>>> To: r-help at r-project.org
>>>> Subject: [R] Sweave encoding problem
>>>> Hello,
>>>> Sweave seems to have trouble processing German letters in R.
>>>> For example, my noweb R input looks like this:
>>>> <<>>=
>>>> Oberflächenfehler = c(4, 11, 6, 2, 7, 9)
>>>> @
>>>> If I send it through Sweave, I get the following error message.
>>>> error:  chunk 1
>>>> Error in parse(text = chunk) : unexpected input in "Oberflä"
>>>> extra: Warning message:
>>>> In readLines(f[1]) :
>>>>    underfull last line in "C:\...."
>>>> (My R is in German, so I had to translate the error message
>>>> myself.)
>>>> I got the impression that this is an encoding issue of Sweave,
>>>> since the input typed into R directly works just fine. The
>>>> encoding I use in my noweb document is utf8.
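
To see why an encoding mismatch can stop the parser right at "Oberflä", it may help to look at the bytes behind the umlaut. The sketch below only inspects a single character; the variable names are made up for illustration, and it does not reproduce the Sweave error itself.

```r
# "ä" (U+00E4) built from its two UTF-8 bytes, so the example does not
# depend on the encoding of this source file.
a_utf8 <- rawToChar(as.raw(c(0xc3, 0xa4)))
Encoding(a_utf8) <- "UTF-8"

# The same character converted to latin1 is a single byte.
a_latin1 <- iconv(a_utf8, from = "UTF-8", to = "latin1")

utf8_bytes <- charToRaw(a_utf8)      # bytes c3 a4
latin1_bytes <- charToRaw(a_latin1)  # byte e4

cat("UTF-8 bytes: ", format(utf8_bytes), "\n")
cat("latin1 bytes:", format(latin1_bytes), "\n")
cat("characters:", nchar(a_utf8, type = "chars"),
    "bytes:", nchar(a_utf8, type = "bytes"), "\n")
```

A reader that assumes latin1 treats each of the two UTF-8 bytes as a separate character, so the identifier no longer looks like a valid name.
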
>>> I don't think it has anything to do with German letters.
>>> I saved the following text in a file 'sweavy.Snw':
>>> \documentclass{article}
>>> \begin{document}
>>> Hello World!
>>> <<>>=
>>> 1+1
>>> @
>>> <<>>=
>>> Oberflächenfehler = c(4, 11, 6, 2, 7, 9)
>>> @
>>> \end{document}
>>> This is what happened in R:
>>>> library(utils)
>>>> Sweave("sweavy.Snw")
>>> Writing to file sweavy.tex
>>> Processing code chunks ...
>>>  1 : echo term verbatim
>>>  2 : echo term verbatim
>>> You can now run LaTeX on 'sweavy.tex'
>>>> sessionInfo()
>>> R version 2.7.0 (2008-04-22) i386-pc-mingw32
>>> locale:
>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>> The dvi also looked fine after running "latex sweavy.tex".
>>> To make sure, I ran the following in my editor (GNU Emacs):
>>> C-x RET f utf-8
>>> to change set-buffer-file-coding-system to utf-8.
>>> It still works fine.
>>> Maybe this helps you to track down the reason for the
>>> problem?
>>> Best,
>>> Roland
