[R] htmlParse hangs or crashes
sjkiss at gmail.com
Mon Sep 5 23:48:57 CEST 2011
Each time I use htmlParse, R crashes or hangs. The URL I'd like to parse is included below, as are the results of a series of basic commands that show what I'm experiencing. The output of sessionInfo() is attached at the bottom of the message.
The thing is, htmlTreeParse appears to work just fine, although its result doesn't appear to contain the information I need (the URLs of the articles linked to on this search page). Regardless, I'd still like to understand why htmlParse doesn't work.
Thank you for any insight.
#returns "HTMLInternalDocument" "XMLInternalDocument"
*** caught segfault ***
address 0x1398754, cause 'memory not mapped'
1: .Call("RS_XML_dumpHTMLDoc", doc, as.integer(indent), as.character(encoding), as.logical(indent), PACKAGE = "XML")
5: as(x, "character")
6: cat(as(x, "character"), "\n")
7: print.XMLInternalDocument(<pointer: 0x11656d3e0>)
8: print(<pointer: 0x11656d3e0>)
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
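The frames shown in the traceback above are all in print.XMLInternalDocument and RS_XML_dumpHTMLDoc, not in htmlParse itself, so the crash may be happening when the parsed document is printed at the console rather than when it is parsed. One thing that might be worth trying is to assign the result and query it with XPath without ever printing it. This is only a minimal sketch under that assumption: the HTML string is a made-up stand-in for the search page, the example.com links are hypothetical, and it is not verified against the XML 3.4-0 build listed below.

```r
library(XML)

# Hypothetical stand-in for the search results page (not the real URL).
html <- '<html><body>
  <a href="http://example.com/article1">First</a>
  <a href="http://example.com/article2">Second</a>
  <a name="no-link">Anchor without an href</a>
</body></html>'

# Assign the result so the document is not auto-printed at the console;
# the traceback shows the segfault under print(), not under htmlParse().
doc <- htmlParse(html, asText = TRUE)

# Select only <a> elements that carry an href and read that attribute.
links <- as.character(unname(xpathSApply(doc, "//a[@href]", xmlGetAttr, "href")))

# Printing a plain character vector does not go through print.XMLInternalDocument.
print(links)
```

If this runs cleanly on the real page, the problem is probably confined to serialising the document for display, and the links can be collected without it.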
R version 2.13.0 (2011-04-13)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
attached base packages:
 stats graphics grDevices utils datasets methods base
other attached packages:
 XML_3.4-0 RCurl_1.5-0 bitops_1.0-4.1
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
Cell: +1 905 746 7606