[R] trouble for parsing HTML files
Milan Bouchet-Valat
nalimilan at club.fr
Fri Mar 23 18:51:49 CET 2012
On Friday 23 March 2012 at 08:10 +0100, Julien Velcin wrote:
> Here it is:
>
> R version 2.14.2 (2012-02-29)
> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
I guess the OS itself uses a French locale? The mismatch between R's
locale and the OS's might be the problem. Can you try running R with a
French locale? It would be strange, because UTF-8 is the same encoding in
both settings, but it is still worth a try...
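For reference, here is a minimal sketch of switching the locale from inside
the R session. The locale name "fr_FR.UTF-8" is an assumption (it is the usual
spelling on Mac OS X and Linux, but names are platform-specific), and I only
touch LC_CTYPE since that is the category that governs character handling.
Sys.setlocale() returns an empty string, with a warning, when the requested
locale is not available.

```r
# Remember the current character-type locale so it can be restored later.
old_ctype <- Sys.getlocale("LC_CTYPE")

# Try a French UTF-8 locale; the name is an assumption and varies by platform.
new_ctype <- suppressWarnings(Sys.setlocale("LC_CTYPE", "fr_FR.UTF-8"))
if (identical(new_ctype, "")) {
  message("Locale fr_FR.UTF-8 is not available on this system")
} else {
  message("LC_CTYPE is now: ", new_ctype)
}

# ... rerun the parsing code here and compare the results ...

# Put the original setting back.
Sys.setlocale("LC_CTYPE", old_ctype)
```

If the parsing behaves differently after the switch, the locale is the likely
culprit; if not, it can probably be ruled out.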
Otherwise, please run this and post the output, just in case:
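# Read the raw page source, one element per line
url <- "http://www.huffingtonpost.com/social/GraniteSkyline?action=fans"
lines <- readLines(url)
# Show the first few lines as R sees them
head(lines)
# Flag any non-ASCII bytes in those lines
library(tools)
showNonASCII(head(lines))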
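To separate download problems from encoding problems, the same kind of check
can be tried on a small local file. This is only a sketch: the two sample
lines are made up for illustration, and declaring encoding = "UTF-8" in
readLines() assumes that this is what the real page is served as.

```r
library(tools)

# Hypothetical two-line sample standing in for the downloaded page.
sample_lines <- c("<title>Fans</title>", "<p>Caf\u00e9 cr\u00e8me</p>")

# Write the strings as raw UTF-8 bytes, independent of the session locale.
tmp <- tempfile(fileext = ".html")
writeLines(enc2utf8(sample_lines), tmp, useBytes = TRUE)

# Read them back, declaring the encoding explicitly.
read_back <- readLines(tmp, encoding = "UTF-8")

# Print the lines that contain non-ASCII bytes, with those bytes escaped.
showNonASCII(read_back)

# The same test as a logical vector: does any code point exceed 127?
non_ascii <- vapply(read_back,
                    function(x) any(utf8ToInt(x) > 127L),
                    logical(1), USE.NAMES = FALSE)

unlink(tmp)
```

If this behaves as expected but the real page does not, the problem is more
likely in how the page is fetched or declared than in the locale.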
Hope this helps