[R] Help with this web scrape function

Sven D sduve at hotmail.com
Fri Jun 1 22:22:19 CEST 2012


Hello,

I am trying to scrape this web page:
http://toast.gasunie.de/gud/search.aspx?soid=GUD&lang=de

The page submits its data with the POST method and contains several HTML form
elements, mostly lists and a couple of radio buttons. After submitting, I should
be forwarded to a new page. Which selections are made in the form does not
really matter. I get quite far; please see the code:

library(RCurl)
library(RHTMLForms)
library(XML)

# Read the descriptions of all forms on the search page
pageForms =
  getHTMLFormDescription("http://toast.gasunie.de/gud/search.aspx?soid=GUD&lang=de")

# Turn the first form into an R function whose arguments are the form fields
fun = createFunction(pageForms[[1]])

# Call the generated function with the desired selections
retSubmit = fun('ctl00$MainContent$GasQuality' = "H",
                'ctl00$MainContent$PointList' = "H071",
                'ctl00$MainContent$PointType' = "EN",
                'ctl00$MainContent$Publishers' = "HourValues",
                'ctl00$MainContent$ListHourValues' = "-1",
                'ctl00_MainContent_webDatePickerFrom_input' = "01.06.2012",
                'ctl00_MainContent_webDatePickerTo_input' = "01.06.2012")

# Parse the returned HTML text and print it
retPage = htmlTreeParse(retSubmit, asText = TRUE)
retPage


This is how far I get: all form elements are set correctly, with the
exception of 'ctl00$MainContent$ListHourValues'. My question is: why does the
function not select a value for ListHourValues? I also think that the
function is not actually submitting the form, because submitting the form
by hand without selecting ListHourValues makes the page return a
message box containing an error.
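
One way to check which selections actually made it into the returned page is
to list the selected option of every <select> element. The following is only a
minimal sketch: selectedOptions and demoPage are made-up names, the demo HTML
is a stand-in for the real page, and it assumes the returned page is available
as a character string (as retSubmit is above). For list boxes that allow
several selections it reports only the first one.

```r
library(XML)

# For each <select> in a page, report the value of its selected <option>
# (NA if no option is marked as selected).
selectedOptions = function(html) {
  doc = htmlParse(html, asText = TRUE)
  selects = getNodeSet(doc, "//select")
  vals = vapply(selects, function(s) {
    v = xpathSApply(s, "./option[@selected]", xmlGetAttr, "value", "")
    if (length(v) > 0) v[[1]] else NA_character_
  }, character(1))
  names(vals) = vapply(selects, xmlGetAttr, character(1), "name", "")
  vals
}

# Made-up stand-in for the returned page
demoPage = '<html><body><form>
  <select name="ctl00$MainContent$GasQuality">
    <option value="L">L</option>
    <option value="H" selected="selected">H</option>
  </select>
  <select name="ctl00$MainContent$ListHourValues">
    <option value="-1">all</option>
    <option value="1">1</option>
  </select>
</form></body></html>'

sel = selectedOptions(demoPage)
print(sel)
```

With the real result this would be called as selectedOptions(retSubmit).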

So the function basically returns the original page with the form elements
set to the right values, but it does not take the next step and return the
final result. Could it be that I have to wrap the retSubmit call in a
postForm() call?
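
A rough sketch of what such a postForm() call might involve, assuming the page
is a typical ASP.NET WebForms page that expects its hidden state fields (such
as __VIEWSTATE and __EVENTVALIDATION) to be posted back together with the
selections. That assumption is not verified for this site. hiddenFields and
demoPage are hypothetical names, and the name of the submit button, which
would probably also have to be sent, is not known here.

```r
library(XML)

# Collect every hidden <input> of a page as a named character vector,
# so the values can be sent back unchanged in the POST request.
hiddenFields = function(html) {
  doc = htmlParse(html, asText = TRUE)
  nodes = getNodeSet(doc, "//input[@type='hidden']")
  vals = vapply(nodes, xmlGetAttr, character(1), "value", "")
  names(vals) = vapply(nodes, xmlGetAttr, character(1), "name", "")
  vals
}

# Made-up stand-in for the downloaded search page
demoPage = '<html><body><form method="post" action="search.aspx">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <select name="ctl00$MainContent$ListHourValues">
    <option value="-1">all</option>
  </select>
</form></body></html>'

# Hidden state fields first, then the selections made by the user
params = c(hiddenFields(demoPage),
           "ctl00$MainContent$ListHourValues" = "-1")
print(names(params))

# Not run here (needs network access and RCurl); u is the page URL:
# library(RCurl)
# page   = getURL(u)
# result = postForm(u, .params = c(hiddenFields(page), mySelections),
#                   style = "POST")
```

Here mySelections stands for the named vector of field values used in the
call to fun() above.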

Best


Sven




--
View this message in context: http://r.789695.n4.nabble.com/Help-with-this-web-scrape-function-tp4632137.html
Sent from the R help mailing list archive at Nabble.com.