[R] questions on French characters in plot
Milan Bouchet-Valat
nalimilan at club.fr
Tue Dec 11 12:11:18 CET 2012
On Tuesday 11 December 2012 at 01:10 +0100, Richard Zijdeman wrote:
> Dear all,
>
> I have imported a dataset from Stata using the foreign package. The
> original data contain French characters such as and .
> After importing, string variables containing names of French
> departments have changed. E.g. Ardèche became Ard\x8fche. I would like
> to ask how I could plot these changed strings, since now the strings
> with special characters fail to be printed in the plot (either using
> plot() or ggplot2()).
>
> I have googled for solutions, but actually find it hard to determine
> whether I should change my R setup or should read in the data in a
> different way. Since I work on a mac I changed my locale according to
> the R for Mac OS X FAQ, chapter 9. Below is some info on my setup and
> code and output on what works for me and what does not. Thank you in
> advance for your comments.
Accented characters should work fine on a machine using a UTF-8
locale like yours. I think the problem is that the imported data use
ISO-8859-15 or UTF-16, not UTF-8.
I have no idea whether .dta files specify an encoding or not, but I
think you can convert them in R by calling
iconv(department, "ISO-8859-15", "UTF-8")
or
iconv(department, "UTF-16", "UTF-8")
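
As a rough sketch of how these guesses could be tested (not a confirmed fix), the snippet below converts the vector from several candidate encodings and returns each result for inspection. The helper name try_encodings is made up for illustration. "macintosh" (Mac OS Roman) is an extra candidate added here as an assumption, because in that encoding byte 0x8F is "è". Encoding names accepted by iconv() vary by platform; iconvlist() shows what is available.

```r
# Strings as they appear after import (byte 0x8F where "è" is expected)
department <- c("Nord", "Paris", "Ard\x8fche")

# Hypothetical helper: convert x from each candidate encoding to UTF-8.
# iconv() returns NA for elements it cannot convert; an unsupported
# encoding name raises an error, which is mapped to NA here as well.
try_encodings <- function(x, encodings) {
  out <- lapply(encodings, function(enc) {
    tryCatch(
      iconv(x, from = enc, to = "UTF-8"),
      error = function(e) rep(NA_character_, length(x))
    )
  })
  names(out) <- encodings
  out
}

# Candidate list is a guess; "macintosh" is an assumption, not from the post
candidates <- c("ISO-8859-15", "UTF-16", "macintosh")
results <- try_encodings(department, candidates)
print(results)
```

Whichever candidate yields a readable "Ardèche" is the likely source encoding, and its converted vector can then be used in the data frame passed to plot().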
> Best,
>
> Richard
>
> #--------------
> rm(list=ls())
> sessionInfo()
> # R version 2.15.2 (2012-10-26)
> # Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> #
> # locale:
> # [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> # creating variables
> department <- c("Nord","Paris","Ard\x8fche")
\x8f does not correspond to "è" in any of these encodings AFAIK. In
ISO-8859-1, ISO-8859-15 and UTF-16, it's \xE8 ("\uE8" in R).
In UTF-8, it's C3 A8, "\303\250" in R.
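
To check which bytes "è" maps to in a given encoding, something like the following can be used. It is only a sketch: the variable names are illustrative, and "macintosh" (Mac OS Roman) is an addition not discussed above, included because the byte 0x8F in the data matches it. Whether iconv() accepts that name depends on the platform.

```r
# "è" as a UTF-8 string (enc2utf8() guards against non-UTF-8 locales)
e_grave <- enc2utf8("\u00e8")

utf8_bytes   <- charToRaw(e_grave)
latin9_bytes <- charToRaw(iconv(e_grave, from = "UTF-8", to = "ISO-8859-15"))
# Assumption: "macintosh" is the iconv name for Mac OS Roman on this system
mac_bytes    <- charToRaw(iconv(e_grave, from = "UTF-8", to = "macintosh"))

print(utf8_bytes)    # [1] c3 a8
print(latin9_bytes)  # [1] e8
print(mac_bytes)     # [1] 8f
```

Comparing these bytes with the escapes R prints for the imported strings gives a hint about the encoding the data were stored in.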
> department2 <- c("Nord", "Paris", "Ardèche")
> n <- c(2,4,1)
>
> # creating dataframes
> df <- data.frame(department,n)
> df2 <- data.frame(department2,n)
>
> department
> # [1] "Nord" "Paris" "Ard\x8fche"
> department2
> # [1] "Nord" "Paris" "Ardèche"
>
> plot(df) # fails to show the text "Ardèche"
> plot(df2) # shows text "Ardèche"
>
> # EOF