[R] translating HTML character entities to accented characters

Michael Friendly friendly at yorku.ca
Fri Aug 10 19:13:58 CEST 2012


Thanks, David

I need an all-R solution for this because the author.csv file is
exported from a database that enforces the HTML encoding, and the
import into R may have to be repeated several times as the database
is updated.
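
A base-R approach along the following lines might work. This is only a sketch under stated assumptions, not something proposed in the thread: `decode_entities()` and the `html_entities` table are hypothetical names, and the table covers only the named entities that occur in the examples here, so it would need extending for other characters. Numeric entities such as `&#232;` are handled generically.

```r
# Hypothetical lookup table (an assumption, not from the thread): only the
# named entities seen in the examples. "&amp;" is kept last so that an
# escaped ampersand such as "&amp;egrave;" is not decoded twice.
html_entities <- c(
  "&egrave;" = "\u00e8", "&eacute;" = "\u00e9", "&ocirc;" = "\u00f4",
  "&uuml;"   = "\u00fc", "&auml;"   = "\u00e4", "&szlig;" = "\u00df",
  "&amp;"    = "&"
)

# Hypothetical helper: decode entities in a character vector.
decode_entities <- function(x) {
  # Decimal numeric entities, e.g. "&#232;", via their code point
  m <- gregexpr("&#[0-9]+;", x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(hits) {
    vapply(hits,
           function(h) intToUtf8(as.integer(gsub("[^0-9]", "", h))),
           character(1), USE.NAMES = FALSE)
  })
  # Named entities from the lookup table, replaced literally
  for (ent in names(html_entities)) {
    x <- gsub(ent, html_entities[[ent]], x, fixed = TRUE)
  }
  x
}

lname <- c("Fr&egrave;re de Montizon", "Ni&eacute;pce", "S&uuml;ssmilch")
decode_entities(lname)
```

The decoded strings come back as UTF-8. If latin1 is really required, something like `iconv(decode_entities(lname), from = "UTF-8", to = "latin1")` could be applied afterwards.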

-Michael

On 8/10/2012 12:40 PM, David L Carlson wrote:
> It's not quite an R solution, but I just pasted your examples into a script
> window in R and saved it as chars.html. Then I opened it in Firefox and
> pasted the results here (with returns inserted to match your original).
>
>> grep("&", author$lname, value=TRUE)
> [1] "Frère de Montizon" "Lumière"
> [3] "Lumière" "Niépce"
> [5] "Süssmilch" "Schüpbach"
>> grep("&", author$birthplace, value=TRUE)
> [1] "Marbach, Württemberg"
> [2] "Côte-d'Or"
> [3] "Chalon-sur-Saône, Saône-et-Loire"
> [4] "Groß Särchen, Germany"
>> apropos("HTML")
> For a CSV file you would want to preserve the lines by adding <br> to the
> end of each line first.
>
> ----------------------------------------------
> David L Carlson
> Associate Professor of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Michael Friendly
>> Sent: Friday, August 10, 2012 11:15 AM
>> To: R-help
>> Subject: [R] translating HTML character entities to accented characters
>>
>> I've imported a .csv file where character strings that contained
>> accented characters were written as HTML
>> character entities.  Is there a function that works on a vector to
>> translate them back to accented (latin1) characters?
>>
>> Some examples:
>>
>>   > grep("&", author$lname, value=TRUE)
>> [1] "Fr&egrave;re de Montizon" "Lumi&egrave;re"
>> [3] "Lumi&egrave;re"           "Ni&eacute;pce"
>> [5] "S&uuml;ssmilch"           "Sch&uuml;pbach"
>>   > grep("&", author$birthplace, value=TRUE)
>> [1] "Marbach, W&uuml;rttemberg"
>> [2] "C&ocirc;te-d'Or"
>> [3] "Chalon-sur-Sa&ocirc;ne, Sa&ocirc;ne-et-Loire"
>> [4] "Gro&szlig; S&auml;rchen, Germany"
>>   > apropos("HTML")
>>
>> thx,
>> -Michael
>>
>> --
>> Michael Friendly     Email: friendly AT yorku DOT ca
>> Professor, Psychology Dept.
>> York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
>> 4700 Keele Street    Web:   http://www.datavis.ca
>> Toronto, ONT  M3J 1P3 CANADA
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.


-- 
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA


