[R] translating HTML character entities to accented characters

Michael Friendly friendly at yorku.ca
Fri Aug 10 18:14:46 CEST 2012


I've imported a .csv file in which character strings containing accented
characters were written as HTML character entities.  Is there a function
that works on a vector to translate them back to accented (latin1)
characters?

Some examples:

 > grep("&", author$lname, value=TRUE)
[1] "Fr&egrave;re de Montizon" "Lumi&egrave;re"
[3] "Lumi&egrave;re"           "Ni&eacute;pce"
[5] "S&uuml;ssmilch"           "Sch&uuml;pbach"
 > grep("&", author$birthplace, value=TRUE)
[1] "Marbach, W&uuml;rttemberg"
[2] "C&ocirc;te-d'Or"
[3] "Chalon-sur-Sa&ocirc;ne, Sa&ocirc;ne-et-Loire"
[4] "Gro&szlig; S&auml;rchen, Germany"
 > apropos("HTML")

thx,
-Michael

-- 
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA
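
For illustration only, here is a minimal sketch of the kind of vectorized
translation asked about above.  It is not from the original post: the
lookup table html_entities and the function decode_entities() are
hypothetical names, and the table covers only a handful of named entities,
so it would need extending for real data.  It assumes the strings use named
entities (such as &eacute;) rather than numeric references.

```r
# Hypothetical lookup table: named HTML entity -> Unicode character.
# Deliberately incomplete; "&amp;" is listed last so that an escaped
# ampersand such as "&amp;eacute;" is not decoded twice.
html_entities <- c(
  "&agrave;" = "\u00e0", "&eacute;" = "\u00e9", "&egrave;" = "\u00e8",
  "&ocirc;"  = "\u00f4", "&auml;"   = "\u00e4", "&uuml;"   = "\u00fc",
  "&szlig;"  = "\u00df", "&amp;"    = "&"
)

# Hypothetical helper: replace each known entity in a character vector.
# gsub() is vectorized over x, so the loop runs over entities only.
decode_entities <- function(x, table = html_entities) {
  for (ent in names(table)) {
    x <- gsub(ent, table[[ent]], x, fixed = TRUE)
  }
  x
}

# Example call on strings like those shown in the question.
lname <- c("Fr&egrave;re de Montizon", "Ni&eacute;pce", "S&uuml;ssmilch")
decoded <- decode_entities(lname)
print(decoded)
```

The result is marked as UTF-8; if latin1 is really required, something like
iconv(decoded, from = "UTF-8", to = "latin1") could be applied afterwards.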


