[R] RCurl, postForm()
Simon Kiss
sjkiss at gmail.com
Mon May 28 21:46:55 CEST 2012
Dear colleagues,
Could I get some assistance using postForm() to scrape the business names and addresses at this website:
http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx
I've read through the RCurl paper (http://www.omegahat.org/RCurl/RCurlJSS.pdf) and searched the web for tutorials, but I can't work out how to submit the search form. This is probably a basic question, but I would appreciate any help. What I have so far follows.
Yours,
Simon Kiss
library(XML)
library(RCurl)
library(scrapeR)
library(RHTMLForms)
# Set the URL of the search page
bus <- "http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx"
# Download the page
orig <- getURLContent(url = bus)
# Parse the downloaded HTML
doc <- htmlParse(orig[[1]], asText = TRUE)
# Get the forms on the page
forms <- getNodeSet(doc, "//form")
forms[[1]]
# These are the input nodes of the first form
getNodeSet(forms[[1]], ".//input")
# These are the select nodes of the first form
getNodeSet(forms[[1]], ".//select")
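A possible next step, given here only as a rough sketch: turn the input nodes into a named set of fields that could be handed to postForm(). Pages ending in .aspx commonly carry hidden fields such as __VIEWSTATE that have to be sent back with the request, but that has not been checked for this site. The HTML string and the field names txtKeyword and btnSearch below are hypothetical stand-ins, not taken from the real page, and the postForm() call is left commented out because it is untested against the live form.

```r
library(XML)  # htmlParse, getNodeSet, xmlGetAttr

# Hypothetical stand-in for the search form; the real field names and
# values must be read from the live page.
html <- '<html><body><form action="Search.aspx" method="post">
  <input type="hidden" name="__VIEWSTATE" value="abc123"/>
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789"/>
  <input type="text" name="txtKeyword" value=""/>
  <input type="submit" name="btnSearch" value="Search"/>
</form></body></html>'

doc <- htmlParse(html, asText = TRUE)

# Keep only the inputs that have a name, since unnamed inputs are not submitted
inputs <- getNodeSet(doc, "//form//input[@name]")

# Build a named character vector: field name -> current value
field_names  <- sapply(inputs, xmlGetAttr, "name")
field_values <- sapply(inputs, xmlGetAttr, "value", default = "")
params <- setNames(field_values, field_names)

# Fill in the (hypothetical) search field, leaving the hidden fields as found
params[["txtKeyword"]] <- "bakery"

# Not run: submit the fields as an ordinary URL-encoded POST (needs RCurl).
# result <- RCurl::postForm(bus, .params = as.list(params), style = "POST")
# results_doc <- htmlParse(result, asText = TRUE)
```

Any select nodes would need the same treatment, using the value of the chosen option for each one.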
*********************************
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 905 746 7606