[R] help with restart
Wincent
ronggui.huang at gmail.com
Wed May 5 10:34:30 CEST 2010
Dear all, I want to download webpages from a large number of URLs.
For example,
########
link <- c("http://gzbbs.soufun.com/board/2811006802/",
"http://gzbbs.soufun.com/board/2811328226/",
"http://gzbbs.soufun.com/board/2811720258/",
"http://gzbbs.soufun.com/board/2811495702/",
"http://gzbbs.soufun.com/board/2811176022/",
"http://gzbbs.soufun.com/board/2811866676/"
)
# the actual vector will be much longer.
ans <- vector("list", length(link))
for (i in seq_along(link)) {
  con <- url(link[i])
  ans[[i]] <- readLines(con)
  close(con)     # close the connection so it is not leaked
  Sys.sleep(8)   # pause between requests to be polite to the server
}
#######
The problem is, the server will not respond if retrievals happen too
often, and I don't know the optimal time span between two retrievals.
When the server does not respond, readLines throws an error and the
loop stops. What I want to do is: when an error occurs, put R to sleep
for, say, 60 seconds, and then retry readLines on the same link.
I did some searching and guess that withCallingHandlers and withRestarts
will do the trick, yet I didn't find many examples of their usage.
Can you give me some suggestions? Thanks.
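[Editor's note: a minimal sketch of the sleep-and-retry idea, using tryCatch
rather than withRestarts; the function name read_with_retry and the retry
count and delay are arbitrary illustrative choices, not part of the original
question.]

```r
# Sketch: retry readLines on error, sleeping between attempts.
# max_tries and wait are arbitrary; tune them to the server's behaviour.
read_with_retry <- function(u, max_tries = 5, wait = 60) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(readLines(u), error = function(e) NULL)
    if (!is.null(result)) return(result)
    message("Attempt ", attempt, " failed; sleeping ", wait, " seconds")
    Sys.sleep(wait)
  }
  stop("Failed to read ", u, " after ", max_tries, " attempts")
}

ans <- vector("list", length(link))
for (i in seq_along(link)) {
  ans[[i]] <- read_with_retry(link[i])
  Sys.sleep(8)   # normal pause between successful requests
}
```

tryCatch returns the value of the error handler when an error occurs, so a
NULL result signals a failed attempt; withRestarts would also work but is
more machinery than this problem needs.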
--
Wincent Rong-gui HUANG
Doctoral Candidate
Dept of Public and Social Administration
City University of Hong Kong
http://asrr.r-forge.r-project.org/rghuang.html