[R] prevent XML::readHTMLTable from suppressing <br/>

Spencer Graves
Sat Jul 25 05:59:55 CEST 2020

Hello, All:

       Thanks to Rasmus Liland, William Michels, and Luke Tierney for 
their help with my earlier web scraping question.  I've made progress.  
Sadly, I still have a problem:  One field has "<br/>", which gets 
suppressed by XML::readHTMLTable:

sosURL <- 
sosChars <- RCurl::getURL(sosURL)
MOcan <- XML::readHTMLTable(sosChars)
MOcan[[2]][1, 2]
[1] "4476 FIVE MILE RDSENECA MO 64865"

(Seneca <- regexpr('SENECA', sosChars))
substring(sosChars, Seneca-22, Seneca+14)

[1] "4476 FIVE MILE RD<br/>SENECA MO 64865"

       How can I get essentially the same result but without having 
XML::readHTMLTable suppress "<br/>"?
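
One possible workaround, sketched here under stated assumptions rather than offered as a confirmed fix: replace each "<br/>" in the raw character string with a visible separator before the string is parsed, so the break survives as ordinary text in the cell. The toy HTML string and the "; " separator are assumptions for illustration; the real page is not reproduced here. The parsing step only runs if the XML package is installed.

```r
# Toy HTML standing in for the real page (hypothetical content).
html <- paste0(
  "<html><body><table>",
  "<tr><th>Name</th><th>Address</th></tr>",
  "<tr><td>A CANDIDATE</td>",
  "<td>4476 FIVE MILE RD<br/>SENECA MO 64865</td></tr>",
  "</table></body></html>"
)

# Replace <br>, <br/> and <br /> with a separator (assumed: "; ")
# before the text ever reaches the HTML parser.
sep <- "; "
html_sep <- gsub("<br\\s*/?>", sep, html, ignore.case = TRUE, perl = TRUE)

if (requireNamespace("XML", quietly = TRUE)) {
  # asText = TRUE tells the parser the argument is content, not a file name.
  doc <- XML::htmlParse(html_sep, asText = TRUE)
  tabs <- XML::readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)
  print(tabs[[1]])

  # The address cell can then be split back into its two lines.
  cells <- unlist(tabs[[1]], use.names = FALSE)
  address <- cells[grepl(sep, cells, fixed = TRUE)]
  print(strsplit(address, sep, fixed = TRUE))
}
```

With the real data the same gsub() call would be applied to sosChars before passing it to XML::readHTMLTable.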

NOTE:  I get something very similar with xml2::read_html and 
rvest::html_table:

sosPointers <- xml2::read_html(sosChars)
MOcan2 <- rvest::html_table(sosPointers)
MOcan2[[2]][1, 2]
[1] "4476 FIVE MILE RDSENECA MO 64865"

       MOcan2 does not have names, and some of the fields are 
automatically converted to integers, which I think is not smart in this 
case.
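
A minimal sketch of how the missing names and the integer conversion might be addressed, assuming a recent rvest (1.0.0 or later) in which html_table() accepts header and convert arguments. The toy table is hypothetical, and the code only runs if xml2 and rvest are installed.

```r
# Toy table standing in for the real page (hypothetical content).
html <- paste0(
  "<table>",
  "<tr><th>Name</th><th>Zip</th></tr>",
  "<tr><td>A CANDIDATE</td><td>64865</td></tr>",
  "</table>"
)

have_pkgs <- requireNamespace("xml2", quietly = TRUE) &&
  requireNamespace("rvest", quietly = TRUE)

if (have_pkgs) {
  page <- xml2::read_html(html)
  # header = TRUE: use the first row as column names.
  # convert = FALSE: keep every column as character (assumes rvest >= 1.0.0).
  tab <- rvest::html_table(page, header = TRUE, convert = FALSE)[[1]]
  str(tab)
}
```

Keeping columns as character avoids surprises with fields such as ZIP codes, which look numeric but are really labels.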

       Spencer Graves
