[R] Downloading tab separated data from internet

HC hcatbr at yahoo.co.in
Sat Dec 3 05:47:36 CET 2011


Hi all,

I am trying to download some tab-separated data from the internet. The data
are not available at a URL that can be known a priori. There is
an intermediate form where start and end dates have to be entered to get to
the required page.

For example, I want to download data for station 03015795. The form for
this station is at:

http://ida.water.usgs.gov/ida/available_records.cfm?sn=03015795

I could get the start date and end date from this form using:

# Specify the station and read the opening form
stn <- "03015795"
myurl <- paste("http://ida.water.usgs.gov/ida/available_records.cfm?sn=", stn, sep="")
mypage1 <- readLines(myurl)

# Get the start and end dates from the table cells.
# Line 124 is hard-coded: that is where the date cells sit on this page.
mypattern <- '<td align="center">([^<]*)</td>'
datalines <- grep(mypattern, mypage1[124], value=TRUE)
# Helper: cut each match out of string s, given the gregexpr positions g
getexpr <- function(s, g) substring(s, g, g + attr(g, 'match.length') - 1)
gg <- gregexpr(mypattern, datalines)
matches <- mapply(getexpr, datalines, gg)
# Keep only the captured cell contents
result <- gsub(mypattern, '\\1', matches)
names(result) <- NULL
mydates <- result[1:2]
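
To show what the extraction step does without hitting the site, here is a
minimal sketch on a made-up table row. The sample line and its dates are
invented for illustration, not taken from the USGS page, and regmatches()
stands in for my getexpr helper:

```r
# Hypothetical sample row; the real page layout may differ.
sample_line <- paste0('<tr><td align="center">1990-10-01</td>',
                      '<td align="center">2007-09-30</td></tr>')

mypattern <- '<td align="center">([^<]*)</td>'

# Pull out every full match of the pattern on the line
cells <- regmatches(sample_line, gregexpr(mypattern, sample_line))[[1]]

# Replace each full match with its captured group (the cell contents)
dates <- gsub(mypattern, "\\1", cells)
print(dates)
```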

I want to know how I can feed these start and end dates to the form and
submit it to reach the data page, and then download the data,
either as displayed in the browser or saved to a file.
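
To make the question concrete, this is roughly what I imagine the last step
looking like if the form accepted its fields as a plain GET query, which I
have not confirmed. The action URL (example.org) and the field names
fromdate and todate are placeholders, not the real ones from the USGS form,
and build_query is a hypothetical helper:

```r
# Hypothetical helper: turn a named list of form fields into a GET query URL.
build_query <- function(action_url, fields) {
  # Percent-encode each value so it is safe inside a URL
  values <- vapply(fields,
                   function(v) utils::URLencode(as.character(v), reserved = TRUE),
                   character(1))
  pairs <- paste(names(fields), values, sep = "=")
  paste0(action_url, "?", paste(pairs, collapse = "&"))
}

# Placeholder action URL and field names (assumptions, not the real form)
query <- build_query("http://example.org/form_action",
                     list(sn = "03015795",
                          fromdate = "1990-10-01",
                          todate = "2007-09-30"))
print(query)

# Would only work if the server accepts GET and returns tab-separated text:
# dat <- read.delim(query)
```

If the form needs a POST request instead, this would not work as is.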

Any help on this is most appreciated.

Thanks.
HC






