[R] questions on French characters in plot

Milan Bouchet-Valat nalimilan at club.fr
Tue Dec 11 12:11:18 CET 2012


On Tuesday, 11 December 2012 at 01:10 +0100, Richard Zijdeman wrote:
> Dear all,
> 
> I have imported a dataset from Stata using the foreign package. The
> original data contain French characters such as  and  .
> After importing, string variables containing names of French
> departments have changed. E.g. Ardèche became Ard\x8fche. I would like
> to ask how I could plot these changed strings, since now the strings
> with special characters fail to be printed in the plot (either using
> plot() or ggplot2()).
> 
> I have googled for solutions, but actually find it hard to determine
> whether I should change my R setup or should read in the data in a
> different way. Since I work on a Mac I changed my locale according to
> the R for Mac OS X FAQ, chapter 9.  Below is some info on my setup and
> code and output on what works for me and what does not. Thank you in
> advance for your comments.
Accented characters should work fine on a machine using a UTF-8
locale like yours. I think the problem is that the imported data are
encoded in ISO8859-15 or UTF-16, not UTF-8.

I have no idea whether .dta files specify an encoding or not, but I
think you can convert them in R by calling
iconv(department, "ISO-8859-15", "UTF-8")
or
iconv(department, "UTF-16", "UTF-8")
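
As a minimal sketch of the first call, assuming the strings really are
ISO-8859-15 (the sample vector below is made up for illustration, with
"è" written as the single byte E8):

```r
# Hypothetical sample: "Ardèche" stored as ISO-8859-15 bytes (è is byte E8)
department <- c("Nord", "Paris", "Ard\xe8che")

# Re-encode every element from ISO-8859-15 to UTF-8
department_utf8 <- iconv(department, from = "ISO-8859-15", to = "UTF-8")

# Inspect the raw bytes of the converted string:
# è should now be the two bytes c3 a8
print(charToRaw(department_utf8[3]))

# iconv() returns NA for elements it cannot convert, so check for failures
print(anyNA(department_utf8))
```

If the source encoding is guessed wrongly, the result is either NA or a
string with the wrong character, so it is worth checking one known value
by eye before plotting.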

> Best,
> 
> Richard
> 
> #--------------
> rm(list=ls())
> sessionInfo()
> # R version 2.15.2 (2012-10-26)
> # Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> #
> # locale:
> # [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> 
> # creating variables
> department  <- c("Nord","Paris","Ard\x8fche")
\x8f does not correspond to "è" AFAIK. In ISO8859-1 and -15 it's \xE8,
and in UTF-16 it's the code unit 00E8 ("\uE8" in R).

In UTF-8, it's C3 A8, "\303\250" in R.
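
These byte values can be checked from R itself. The sketch below is only
illustrative: the variable names are made up, and trying "macintosh"
(the name R's iconv() documentation gives for Mac Roman, in which byte 8F
is "è") is an assumption about where \x8f might come from, not something
established in this thread.

```r
# "\u00e8" is è given as a Unicode code point; enc2utf8() ensures UTF-8 storage
e_grave <- enc2utf8("\u00e8")
print(charToRaw(e_grave))   # UTF-8 bytes: c3 a8

# The same character as a single byte in ISO-8859-15
latin9 <- iconv(e_grave, from = "UTF-8", to = "ISO-8859-15")
print(charToRaw(latin9))    # e8

# Try candidate source encodings for the string from the question.
# The candidate list is an assumption; extend it as needed.
mystery    <- "Ard\x8fche"
candidates <- c("ISO-8859-15", "macintosh")
guesses    <- sapply(candidates, function(enc) {
  iconv(mystery, from = enc, to = "UTF-8")
})
print(guesses)
```

Whichever candidate prints the expected department name is the one to
pass as the "from" argument of iconv().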

> department2 <- c("Nord", "Paris", "Ardèche")
> n           <- c(2,4,1)
> 
> # creating dataframes
> df  <- data.frame(department,n)
> df2 <- data.frame(department2,n)
> 
> department
> # [1] "Nord"       "Paris"      "Ard\x8fche"
> department2
> # [1] "Nord"    "Paris"   "Ardèche"
> 
> plot(df) # fails to show the text "Ardèche"
> plot(df2) # shows text "Ardèche"
> 
> # EOF