[R] readHTML within tm package

Peter O'Donnell peter.odonnell at infotech.monash.edu.au
Fri Dec 11 04:57:12 CET 2009


I'm hoping to use the tm package with some HTML documents. Both the
documentation and the tutorial material say that there is a readHTML
routine for reading HTML documents into a corpus. However, when I try to
use that routine I get an error, and when I run getReaders() (below),
readHTML isn't listed.

> getReaders()
[1] "readDOC"                 "readGmane"              
[3] "readPDF"                 "readReut21578XML"       
[5] "readReut21578XMLasPlain" "readPlain"              
[7] "readRCV1"                "readTabular"  

Am I missing something? Is there an extra install I need, or has the
routine been removed or replaced?

Thanks, Peter
P.S. I'm running the latest R release on Mac OS X 10.6.2.



