[R] Parsing of HTML files in R

Duncan Temple Lang duncan at research.bell-labs.com
Fri Oct 26 16:56:09 CEST 2001


Hi Luis,

 I just uploaded the latest version of the XML package
to the Omegahat web site at
  http://www.omegahat.org/RSXML/XML_0.7-0.tar.gz
and this now has support for parsing HTML.

I only added support for the DOM style of parsing, i.e.
reading the entire tree and then applying R functions
to convert it. Hopefully that will be enough to suit
your needs.
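
For the HTML-table case in the question, the DOM style might look roughly like the sketch below. It is only an illustration under assumptions: the function names (htmlTreeParse, getNodeSet, xpathSApply, xmlValue) are those of later releases of the XML package and may not all be available in 0.7-0, and the HTML string and the column names are made up for the example.

```r
library(XML)  # assumes a later release of the XML package is installed

# Hypothetical input: a small HTML table held in a string.
html <- "<html><body><table>
<tr><th>name</th><th>score</th></tr>
<tr><td>a</td><td>1</td></tr>
<tr><td>b</td><td>2</td></tr>
</table></body></html>"

# Step 1 (DOM style): read the entire document into a tree.
doc <- htmlTreeParse(html, asText = TRUE, useInternalNodes = TRUE)

# Step 2: apply R functions to the tree to collect the cell text, row by row.
rows  <- getNodeSet(doc, "//table//tr")
cells <- lapply(rows, function(r) xpathSApply(r, "./th | ./td", xmlValue))

# Assumption about the layout: the first row is the header, the rest are data.
tab <- as.data.frame(do.call(rbind, cells[-1]), stringsAsFactors = FALSE)
names(tab) <- cells[[1]]
tab$score <- as.numeric(tab$score)  # cell text arrives as character
```

Every cell comes back as a character string, so the numeric conversion has to be done by hand, column by column. Later releases of the package also offer readHTMLTable(), which wraps these steps.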

Please let me know if there are any problems with the package.

 Thanks for the suggestion to include HTML support.

 Duncan.

Luis Torgo wrote:
> Is there any package similar to the XML package that is able to
> "extract" relevant information from HTML files? Namely, I'm interested
> in reading data that is represented as an HTML table into some R-type
> structure.
> Thank you.
> 
> --
> Luis Torgo
>     FEP/LIACC, University of Porto   Phone : (+351) 22 607 88 30
>     Machine Learning Group           Fax   : (+351) 22 600 36 54
>     R. Campo Alegre, 823             email : ltorgo at liacc.up.pt
>     4150 PORTO   -  PORTUGAL         WWW   : http://www.liacc.up.pt/~ltorgo
> 
> 
> 
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

-- 
_______________________________________________________________

Duncan Temple Lang                duncan at research.bell-labs.com
Bell Labs, Lucent Technologies    office: (908)582-3217
700 Mountain Avenue, Room 2C-259  fax:    (908)582-3340
Murray Hill, NJ  07974-2070       
         http://cm.bell-labs.com/stat/duncan