[R] Sweave encoding problem

Duncan Murdoch murdoch at stats.uwo.ca
Fri Jan 23 13:07:03 CET 2009


Gerrit Voigt wrote:
> The two documents were actually different, which I didn't notice
> yesterday: one had a different encoding. Thanks for your help, Duncan.
> Unfortunately, the other problem still exists. My R or Sweave seems
> unable to work with UTF-8 encoding, though everything works fine with
> Latin-1. I could check my assumption if there were a way to switch R
> from Latin-1 to UTF-8. Does anybody have an idea how that might work?
>   

Connections and the functions that read from them generally have an
"encoding" argument; I think you need to set it to "UTF-8" or
"latin1" as appropriate.  However, Sweave() doesn't offer an option to
pass that argument down to the readLines() call that actually reads the
file.  I believe options(encoding="UTF-8") or options(encoding="latin1")
will set the default, provided you run it before calling Sweave().
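
A minimal sketch of both approaches, assuming a current R installation.
The file names "demo-utf8.txt" and "mydoc.Rnw" are made up for
illustration, and I have not checked how Sweave() itself reacts on every
platform, so treat this as something to experiment with rather than a
guaranteed fix:

```r
# Sketch only: "demo-utf8.txt" and "mydoc.Rnw" are hypothetical file names.
path <- file.path(tempdir(), "demo-utf8.txt")

# Write one line as explicit UTF-8 bytes ("\u00e4" is a-umlaut), plus a newline.
txt <- "Oberfl\u00e4chenfehler = c(4, 11, 6, 2, 7, 9)\n"
writeBin(charToRaw(enc2utf8(txt)), path)

# Variant 1: give the encoding to the connection itself.
con <- file(path, open = "r", encoding = "UTF-8")
lines_con <- readLines(con)
close(con)

# Variant 2: change the session default, which readLines(path) then picks up.
old <- options(encoding = "UTF-8")   # options() returns the previous values
lines_opt <- readLines(path)
# Sweave("mydoc.Rnw")                # a Sweave() call would go here
options(old)                         # restore the previous default
```

Saving the value returned by options() and passing it back afterwards
keeps the change from leaking into the rest of the session.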

You will probably find it frustrating to keep switching that option; I'd 
recommend storing files in the native encoding for your system, which R 
will default to using.  (This doesn't work if you share the same file on 
multiple systems, of course.)
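
To see what the native encoding of a session is, and to re-encode a file
once instead of switching the option back and forth, something like the
following may help.  convert_file() is a hypothetical helper written for
this message, not a function in R or Sweave, and the file names are
invented:

```r
# What does this R session consider its native encoding?
print(Sys.getlocale("LC_CTYPE"))   # locale name, often ending in the charset
print(l10n_info()[["UTF-8"]])      # TRUE if the session runs in a UTF-8 locale

# Hypothetical helper (not part of R or Sweave): re-encode a text file.
# Passing to = "" would target the encoding of the current locale.
convert_file <- function(infile, outfile, from, to) {
  con <- file(infile, open = "r", encoding = "native.enc")  # read bytes unchanged
  on.exit(close(con))
  lines <- readLines(con, warn = FALSE)
  converted <- iconv(lines, from = from, to = to)
  writeLines(converted, outfile, useBytes = TRUE)           # write bytes unchanged
}

# Demo: a Latin-1 file containing "Oberfl<e4>che" followed by a newline.
src <- file.path(tempdir(), "latin1-demo.txt")
dst <- file.path(tempdir(), "utf8-demo.txt")
writeBin(as.raw(c(0x4f, 0x62, 0x65, 0x72, 0x66, 0x6c, 0xe4,
                  0x63, 0x68, 0x65, 0x0a)), src)
convert_file(src, dst, from = "latin1", to = "UTF-8")

# Inspect the result as raw bytes; the a-umlaut should now be c3 a4.
out_bytes <- readBin(dst, what = "raw", n = file.info(dst)$size)
print(out_bytes)
```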

Duncan Murdoch


> Gerrit Voigt
>
> Duncan Murdoch wrote:
>   
>> Gerrit Voigt wrote:
>>     
>>> Hi Roland,
>>> thanks for your answer. I tried a different, smaller LaTeX header
>>> and the Sweave run suddenly worked. So I copied parts of the old
>>> header into the new one, to check which part was causing the
>>> trouble. In the end I had two documents with identical content.
>>> The new document worked fine with Sweave; the other still gave the
>>> error message. If anybody has experienced this problem before and
>>> knows an answer, please let me know.
>>>   
>>>       
>> This sounds like you have discovered homeopathic properties in 
>> Sweave!  It will be serious if input files remember errors even after 
>> they have been removed.
>>
>> But I think it's more likely that the files just look the same in your 
>> editor, but are actually different in some way you don't see.  
>> Candidates:
>> - the encoding:  maybe your editor is recognizing the encoding, and 
>> automatically displaying similar content from different input.
>> - non-printing characters:  maybe your editor is skipping some.
>>
>> I'd suggest doing a binary compare on the two files to see what the 
>> differences are.  I think you are on Windows (but I may be misreading 
>> the quotes below); I recommend Beyond Compare (a shareware compare 
>> utility).  It has a hex viewer plug-in that could show you a detailed 
>> comparison.  I imagine diff on Unix has something similar.
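
The byte-level comparison can also be done from within R.  This is only
a sketch: diff_bytes() is a hypothetical helper, and the two temporary
files stand in for the two versions of the document.

```r
# Hypothetical helper: report where two files differ, byte by byte.
diff_bytes <- function(path_a, path_b) {
  a <- readBin(path_a, what = "raw", n = file.info(path_a)$size)
  b <- readBin(path_b, what = "raw", n = file.info(path_b)$size)
  n <- min(length(a), length(b))   # compare only the common prefix
  list(
    same_length = length(a) == length(b),
    first_diffs = head(which(a[seq_len(n)] != b[seq_len(n)]), 10)
  )
}

# Demo: the same visible text saved once as Latin-1 and once as UTF-8.
f_latin1 <- file.path(tempdir(), "a-latin1.txt")
f_utf8   <- file.path(tempdir(), "a-utf8.txt")
writeBin(as.raw(c(0x66, 0x6c, 0xe4, 0x0a)), f_latin1)        # "fl", e4, newline
writeBin(as.raw(c(0x66, 0x6c, 0xc3, 0xa4, 0x0a)), f_utf8)    # "fl", c3 a4, newline

res <- diff_bytes(f_latin1, f_utf8)
print(res)
```

Files that look identical in an editor but differ in encoding will show
a length mismatch and differences starting at the first non-ASCII
character.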
>>
>> Duncan Murdoch
>>     
>>> Unfortunately, I also still have an encoding problem with the new
>>> document, the one that ran through Sweave. If I use "ISO-8859-15" as
>>> the encoding in my editor and "latin1" as the input encoding in my
>>> LaTeX document, everything works fine. If I keep both as "utf8", as
>>> I would like, German umlauts aren't displayed correctly.
>>>
>>> Rau, Roland wrote:
>>>  
>>>       
>>>> Hi Gerrit,
>>>>
>>>>      
>>>>         
>>>>> -----Original Message-----
>>>>> From: r-help-bounces at r-project.org 
>>>>> [mailto:r-help-bounces at r-project.org] On Behalf Of Gerrit Voigt
>>>>> Sent: Monday, January 19, 2009 4:48 PM
>>>>> To: r-help at r-project.org
>>>>> Subject: [R] Sweave encoding problem
>>>>>
>>>>> Hello,
>>>>> Sweave seems to have trouble processing German letters in R.
>>>>> For example, my noweb R input looks like this:
>>>>> <<>>=
>>>>> Oberflächenfehler = c(4, 11, 6, 2, 7, 9)
>>>>> @
>>>>> If I send it through Sweave, I get the following error message.
>>>>>
>>>>> error:  chunk 1
>>>>> Error in parse(text = chunk) : unexpected input in "Oberflä"
>>>>> extra: Warning message:
>>>>> In readLines(f[1]) :
>>>>>    underfull last line in "C:\...."
>>>>>
>>>>> (My R is in German, so I had to translate the error message
>>>>> myself.)
>>>>>
>>>>> I got the impression that this is an encoding issue in Sweave,
>>>>> since the same input typed directly into R works just fine. The
>>>>> encoding I use in my noweb document is UTF-8.
>>>>>           
>>>>>           
>>>> I don't think it has anything to do with German letters.
>>>> I saved the following text in a file 'sweavy.Snw':
>>>> \documentclass{article}
>>>>
>>>> \begin{document}
>>>> Hello World!
>>>>
>>>> <<>>=
>>>> 1+1
>>>> @
>>>> <<>>=
>>>> Oberflächenfehler = c(4, 11, 6, 2, 7, 9)
>>>> @
>>>> \end{document}
>>>>
>>>> This is what happened in R:
>>>>      
>>>>         
>>>>> library(utils)
>>>>> Sweave("sweavy.Snw")
>>>>>           
>>>>>           
>>>> Writing to file sweavy.tex
>>>> Processing code chunks ...
>>>>  1 : echo term verbatim
>>>>  2 : echo term verbatim
>>>>
>>>> You can now run LaTeX on 'sweavy.tex'
>>>>      
>>>>         
>>>>> sessionInfo()
>>>>>           
>>>>>           
>>>> R version 2.7.0 (2008-04-22) i386-pc-mingw32
>>>> locale:
>>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>>>
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>> The DVI also looked fine after running "latex sweavy.tex".
>>>> To make sure, I ran the following in my editor (GNU Emacs 22.1.50.1)
>>>> C-x RET f utf-8
>>>> to change set-buffer-file-coding-system to utf-8.
>>>> It still works fine.
>>>>
>>>> Maybe this helps you track down the cause of the problem?
>>>>
>>>> Best,
>>>> Roland
>>>>
>>>>
>>>>
>>>>       
>>>>         
>>>
>>>   
>>> ------------------------------------------------------------------------
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide 
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>   
>>>       
>>     
>
>



