Dear all, I want to download web pages from a large number of URLs. For example,
########
link <- c("http://gzbbs.soufun.com/board/2811006802/",
          "http://gzbbs.soufun.com/board/2811328226/",
          "http://gzbbs.soufun.com/board/2811720258/",
          "http://gzbbs.soufun.com/board/2811495702/",
          "http://gzbbs.soufun.com/board/2811176022/",
          "http://gzbbs.soufun.com/board/2811866676/")  # the actual vector will be much longer

ans <- vector("list", length(link))
for (i in seq_along(link)) {
  ans[[i]] <- readLines(url(link[i]))
  Sys.sleep(8)
}
########

The problem is that the server stops responding if retrievals happen too often, and I don't know the optimal time span between two retrievals. When the server does not respond, readLines raises an error and the loop stops. What I want to do is: when an error occurs, put R to sleep for, say, 60 seconds, and then redo readLines on the same link. From some searching I guess withCallingHandlers and withRestarts will do the trick, but I didn't find many examples of their usage. Can you give me some suggestions? Thanks.

--
Wincent Rong-gui HUANG
Doctoral Candidate
Dept of Public and Social Administration
City University of Hong Kong
http://asrr.r-forge.r-project.org/rghuang.html
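
P.S. To make my intention concrete, here is a rough sketch of the behaviour I am after, written with tryCatch because that is the only construct I could piece together myself (the helper name, the 60-second pause, and the single retry are my own arbitrary choices); whether withCallingHandlers/withRestarts would be a better fit is exactly what I am asking about.

########
# hypothetical helper: read one URL, and if an error occurs, sleep 60
# seconds and try that same link once more
read_with_retry <- function(u) {
  tryCatch(readLines(url(u)),
           error = function(e) {
             message("server did not respond, sleeping 60 seconds: ", u)
             Sys.sleep(60)
             readLines(url(u))  # second attempt on the same link
           })
}

ans <- vector("list", length(link))
for (i in seq_along(link)) {
  ans[[i]] <- read_with_retry(link[i])
  Sys.sleep(8)
}
########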