[R] Do colClasses in readHTMLTable (XML Package) work?
Duncan Temple Lang
duncan at wald.ucdavis.edu
Sat Mar 20 14:04:28 CET 2010
On 3/17/10 6:52 PM, Marshall Feldman wrote:
> Hi,
>
> I can't get the colClasses option to work in the readHTMLTable function
> of the XML package. Here's a code fragment:
>
> require("XML")
> doc <- "http://www.nber.org/cycles/cyclesmain.html"
> # The main table is the second one because it's embedded in the page table.
> table <- getNodeSet(htmlParse(doc), "//table")[[2]]
> xt <- readHTMLTable(
>     table,
>     header = c("peak", "trough", "contraction", "expansion",
>                "trough2trough", "peak2peak"),
>     colClasses = c("character", "character", "character",
>                    "character", "character", "character"),
>     trim = TRUE
> )
>
> Does anyone know what's wrong?
colClasses is being applied, but the coercion of the table columns
is done before the call to as.data.frame(), which can then turn the
character columns into factors. You can add
stringsAsFactors = FALSE
to the call to readHTMLTable() and you'll get what you expect,
I believe.
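
As a rough illustration of the mechanism (base R only, not the actual readHTMLTable() internals), the sketch below builds a data frame from columns that are already character vectors. The list `cols` and its values are hypothetical placeholders, not data from the NBER page. stringsAsFactors is set explicitly in both calls because its default changed from TRUE to FALSE in R 4.0.0.

```r
# Hypothetical stand-in for table columns that have already been
# coerced to character; the values are placeholders.
cols <- list(peak = c("a", "b"), trough = c("c", "d"))

# With stringsAsFactors = TRUE (the default before R 4.0.0),
# as.data.frame() converts the character columns to factors.
as_factors <- as.data.frame(cols, stringsAsFactors = TRUE)

# With stringsAsFactors = FALSE the character columns are kept as they are.
as_strings <- as.data.frame(cols, stringsAsFactors = FALSE)

# Compare the resulting column classes.
print(sapply(as_factors, class))
print(sapply(as_strings, class))
```

Passing stringsAsFactors = FALSE to readHTMLTable() should have the same effect on the columns it returns, assuming the argument reaches as.data.frame() as described above.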
D.
>
> Marsh Feldman