[R-sig-Debian] invalid multibyte string at '<a0>'

Matthieu Stigler matthieu.stigler at gmail.com
Tue Aug 16 16:30:34 CEST 2011


Thanks a lot Anne!

I'm somewhat glad to see the problem is confirmed... Indeed, one can 
edit the file by hand, but ideally I would like a more direct 
"R solution", since there are quite a lot of files.
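
For the file-manipulation route, a minimal batch sketch (not an R solution) of the replacement Anne describes in the quoted message below. It assumes the stray character is the single byte 0xA0 (the non-breaking space in ISO-8859-1), which is what the '<a0>' in the error message suggests. The directory names indir and outdir and the sample rows are made up for illustration.

```shell
#!/bin/sh
# Sketch only: indir/outdir and the sample rows are made up for illustration.
indir=$(mktemp -d)
outdir=$(mktemp -d)

# Build a small stand-in for prob.csv: CRLF line ends and a lone 0xA0 byte
# (octal 240, the ISO-8859-1 non-breaking space) as the last field.
printf 'Location,,Time Period,Low-birth-weight newborns (%%)\r\n' > "$indir/prob.csv"
printf 'Afghanistan,,2004,\240\r\n' >> "$indir/prob.csv"
printf 'Albania,,2009,\240\r\n' >> "$indir/prob.csv"

# Write a cleaned copy of every CSV; the originals stay untouched.
for f in "$indir"/*.csv; do
    # LC_ALL=C makes tr treat the input as plain bytes, so only 0xA0 is dropped.
    LC_ALL=C tr -d '\240' < "$f" > "$outdir/$(basename "$f")"
done
```

The cleaned copies in outdir can then be passed to read.csv as before; in Anne's test a file cleaned this way gave NAs in the right places. Dropping every 0xA0 byte is only reasonable because file reports a single-byte ISO-8859 encoding; in a UTF-8 file the same byte can be part of a longer character.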

Would anyone have another suggestion?

Thanks!!

On 16/08/2011 12:51, Anne Ghisla wrote:
> On Tue, 16 Aug 2011 11:53:53 +0200
> Matthieu Stigler <matthieu.stigler at gmail.com> wrote:
>
>> Hi
>>
>> I have a problem reading files that come from Windows... instead of
>> NA in the last column, these files have a special ending '<a0>',
>> which causes problems... The problem does not appear when reading
>> the same file on Windows! Try:
>> read.csv("http://dl.dropbox.com/u/6113358/prob.csv")
>>
>> Could you please tell me if you also have this problem? I have tried
>> both cleaning the file on Linux (fromdos, recode...) and setting
>> different encodings... neither worked!
> Hi Matthieu, list,
>
>> Any idea? Should be obvious, I feel :-(
> I can reproduce the error with the command above. Out of curiosity I
> opened the file in Mousepad (a very simple text editor) and didn't see
> any sign of the special line endings.
> Opening the file with head showed question marks in place of the
> NAs. (I use UTF-8 encoding.)
>
> [anne@galadriel ~]$ head /tmp/prob.csv
> Location,,Time Period,Low-birth-weight newborns (%)
> Afghanistan,,2004,�
> Albania,,2009,�
>
> This is the encoding information given by file:
> [anne@galadriel ~]$ file /tmp/prob.csv
> /tmp/prob.csv: ISO-8859 English text, with CRLF line terminators
>
> In Mousepad I used the Replace command: I selected the trailing
> whitespace character on the first line that gave the error and replaced
> every occurrence with nothing. I saved the file under a new name, and
> importing it into R gave a correct data frame with NAs in the right
> places.
>
> I don't know much about proper solutions to encoding issues... still I
> hope this solves your issue. It can be useful to look for the problem in
> the program that generates the file.
>
>> Thanks!!
>>
>> Matthieu
> all the best,
> Anne
>


