Wet Bell Diver <wetbelldiver <at> gmail.com> writes:

> R-3.0.1, RStudio, Win7 x64
>
> Dear list,
>
> I would like to download all the webpages of the Journal Citation
> Reports (science edition) for a given year. I can do so manually, but
> that is very time-intensive, so I would like to use R for that.
>
> I have tried many things, including:
>
>     download.file(url = "http://admin-apps.webofknowledge.com/JCR/JCR?RQ=SELECT_ALL&cursor=21",
>                   destfile = "test.htm", method = "internal")
>
> which should get the page starting with journal number 21.
> However, test.htm only includes the message:
>
You need to review the RCurl package and look for "cookies", which will
allow you (once you have established a session in a browser) to copy the
cookies (the tokens that grant you access) into your R session.

However, you will probably be violating the terms of service of JCR; you
should talk to your librarian about this. When I wanted to do a similar
project, I worked out a system where I generated the URLs automatically
and had a student assistant (efficiently) visit the URLs and paste the
results into output files.

  Ben Bolker

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
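To illustrate the cookie approach: a minimal sketch using RCurl, assuming
you have copied the session cookies out of your browser's developer tools
after logging in. The cookie names and values below are placeholders, not
real JCR credentials.

    library(RCurl)

    ## Placeholder cookie string -- replace with the actual cookies from
    ## your authenticated browser session (browser dev tools -> Network).
    cookies <- "JSESSIONID=abc123; SID=xyz789"

    ## Attach the cookies to a reusable curl handle so every request
    ## is made within the established session.
    curl <- getCurlHandle(cookie = cookies, followlocation = TRUE)

    url  <- "http://admin-apps.webofknowledge.com/JCR/JCR?RQ=SELECT_ALL&cursor=21"
    page <- getURL(url, curl = curl)

    ## Save the returned HTML instead of the "not logged in" message.
    writeLines(page, "test.htm")

Whether this returns the journal listing rather than the login message
depends entirely on the cookies still being valid for the session.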
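The URL-generation scheme mentioned above can be sketched like this,
assuming (from the original URL) that the cursor advances by 20 journals
per page; the upper bound and file name are arbitrary choices for the
example.

    ## Build one URL per results page of the JCR listing.
    base    <- "http://admin-apps.webofknowledge.com/JCR/JCR?RQ=SELECT_ALL&cursor="
    cursors <- seq(1, 201, by = 20)   # adjust to cover all journals
    urls    <- paste0(base, cursors)

    ## Write the list out so it can be worked through by hand.
    writeLines(urls, "jcr_urls.txt")

This keeps the mechanical part (constructing URLs) in R while the actual
retrieval is done manually within the terms of service.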