[Rd] Error on Windows build: "unable to re-encode"

Duncan Murdoch murdoch at stats.uwo.ca
Sat Feb 27 23:59:44 CET 2010

On 27/02/2010 2:38 AM, Felix Schönbrodt wrote:
> Thanks for your help - that was the solution (it was easy enough to remove these two characters - they were only in comments anyway).
> Fortunately, the DESCRIPTION file accepts umlauts, as in my last name. The problem was only in the source file.

I've changed R-devel so that it now gives a warning instead of an error 
in such cases.  The warning reports the line numbers of the bad 
characters, and the installer converts them to <xx>-style hex codes.  If 
you've used them in variable names this will likely lead to a syntax 
error; in string literals it will look ugly but should be accepted.  In 
comments it will look ugly, but comments aren't normally saved, so they 
won't really matter there.
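
For illustration only, here is a minimal sketch of what <xx>-style hex codes look like, using the sub = "byte" argument of iconv(). It is an assumption that this mirrors what the installer does internally; the sample string is made up to resemble the comment in the quoted message below, and the exact result of a failed conversion can vary between iconv implementations.

```r
# A comment line containing U+FFFD (bytes EF BF BD in UTF-8), written with
# \u escapes so that this example itself stays pure ASCII.
x <- enc2utf8("# TODO: \ufffdberpr\ufffdfen!")

# Strict conversion: a string that cannot be represented in latin1
# is normally returned as NA.
strict <- iconv(x, from = "UTF-8", to = "latin1")
print(is.na(strict))

# With sub = "byte", each unconvertible byte is replaced by its
# two-digit hex value in angle brackets.
escaped <- iconv(x, from = "UTF-8", to = "latin1", sub = "byte")
print(escaped)
```

Since the substituted text is plain ASCII, it is harmless inside a comment or string literal, but it would not form a valid name if it appeared in an identifier.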

Duncan Murdoch

> Felix
> On 26.02.2010 at 18:37, Duncan Murdoch wrote:
>> On 26/02/2010 11:05 AM, Felix Schönbrodt wrote:
>>> Hi Duncan,
>>> I have now declared the encoding in DESCRIPTION as UTF-8 (and all files are encoded that way, too). As my last name is "Schönbrodt", I'd be happy to see it that way in the package ;-)
>>> However, it still doesn't build on Windows (but works on Mac and Linux). 
>>> Unfortunately I cannot build the Windows packages myself (I work on a Mac), but the win-builder by Uwe Ligges still shows the same error ...
>>>> If declaring the encoding in DESCRIPTION doesn't solve the problem, I'd be happy to take a look at the package.
>>> That's a great offer! I'd be very happy if you could take a look.
>>> You can find the source at http://r-forge.r-project.org/projects/tripler/, a tar.gz is attached as well.
>> I got the same error as you.  It looks as though iconv has trouble with the way some characters are encoded in your file.  For example, on line 893, you have a u-umlaut encoded as EF BF BD.  According to the UTF-8 tables at http://www.utf8-chartable.de/unicode-utf8-table.pl?start=65280, that encodes a question mark in a diamond, "REPLACEMENT CHARACTER".  There's no corresponding character in the standard Windows latin1 encoding, so conversion fails.  Firefox can display the funny question mark, but it doesn't display the u-umlaut as you intended, so I think this is an error in your file.
>> A way to find all such errors is as follows:  read the file as utf-8, then use the iconv() function in R to convert it to latin1.  When I do that, I get NA on lines 893 and 953, which are displayed to me as
>> [1] "\t# im latenten Fall: die Error variance erst am Ende berechnen (d.h., alle error componenten �ber alle Gruppen mitteln, die unter NUll auf Null setzen, dann addieren)"
>> [2] "\t\t# TODO: �berpr�fen!"    
>> We might be able to make the error message in the package installer more informative (e.g. giving the line number that failed).  I'll look into that.
>> Duncan Murdoch
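
The check described above (read the file as UTF-8, convert to latin1 with iconv(), look for NA) could be sketched roughly as follows. The helper name find_unconvertible_lines and the temporary demo file are hypothetical and are not part of the package or of R itself.

```r
# Hypothetical helper: return the line numbers and text of lines in a
# UTF-8 file that cannot be converted to latin1.
find_unconvertible_lines <- function(path) {
  lines <- readLines(path, encoding = "UTF-8", warn = FALSE)
  converted <- iconv(lines, from = "UTF-8", to = "latin1")
  bad <- which(is.na(converted))  # iconv() gives NA where conversion fails
  data.frame(line = bad, text = lines[bad], stringsAsFactors = FALSE)
}

# Demo on a made-up three-line file; only line 2 contains U+FFFD.
demo <- enc2utf8(c("x <- 1",
                   "# TODO: \ufffdberpr\ufffdfen!",
                   "y <- 2  # gr\u00fcn"))
tmp <- tempfile(fileext = ".R")
con <- file(tmp, open = "wb")
writeLines(demo, con, useBytes = TRUE)  # write the UTF-8 bytes unchanged
close(con)

result <- find_unconvertible_lines(tmp)
print(result$line)
unlink(tmp)
```

A genuine u-umlaut (as on the third demo line) converts to latin1 without trouble, so only lines containing characters with no latin1 equivalent are reported.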
