Glenn,

RStudio did a webinar on web scraping with the rvest package that made it 
look really easy.  I haven't gotten around to using it yet, but the video 
should be on their website somewhere.  The link below is the PDF of the slides. 
It should be educational and will probably give you what you need to know to 
get the data you need:

https://github.com/rstudio/webinars/blob/master/32-Web-Scraping/02-Web-Scraping.pdf
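
For a flavor of what rvest code looks like, here is a minimal sketch.  It 
parses an inline HTML string rather than a live page, and the table id, column 
names and numbers are all made up for illustration; nothing here has been run 
against the ICE site.

```r
library(rvest)

# Made-up HTML standing in for a downloaded page. For a live page you
# would pass the page's URL to read_html() instead of a literal string.
page <- read_html('
<html><body>
  <table id="rates">
    <tr><th>Tenor</th><th>Rate</th></tr>
    <tr><td>1Y</td><td>1.10</td></tr>
    <tr><td>5Y</td><td>2.05</td></tr>
  </table>
</body></html>')

# Select the table with a CSS selector and convert it to a data frame.
rates <- html_table(html_node(page, "table#rates"), header = TRUE)
print(rates)
```

The CSS selector is the part you would work out with the browser's inspector 
on the real page.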


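On the question of which page a form posts to: the target is normally the 
form's "action" attribute, and the fields to send are identified by their 
"name" attributes rather than their ids.  The sketch below pulls both out of a 
made-up form.  The URL, form markup and field names are hypothetical and are 
not taken from the ICE page.

```r
library(rvest)
library(xml2)

# Hypothetical address of the page that contains the form.
base_url <- "https://example.com/reports/180"

# Hypothetical form markup standing in for the real report page.
page <- read_html('
<html><body>
  <form id="reportForm" method="post" action="/reports/run">
    <input type="text" name="reportDate">
    <select name="seriesName"><option value="A">A</option></select>
  </form>
</body></html>')

form <- html_node(page, "form")

# The action attribute may be relative, so resolve it against the page URL.
post_url <- url_absolute(html_attr(form, "action"), base_url)

# The name attributes are the keys to use in the POST body.
field_names <- html_attr(html_nodes(form, "input, select"), "name")

print(post_url)
print(field_names)

# With httr, the request would then go to post_url. After any redirects,
# resp$url holds the final address and resp$all_headers the headers of
# each hop:
# resp <- httr::POST(post_url, body = list(...), encode = "form")
```

If the form is submitted by JavaScript instead of a plain action, the 
Network tab of the browser's developer tools shows the request that is 
actually sent.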

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Glenn Schultz
Sent: Monday, December 19, 2016 10:02 AM
To: R Help R
Subject: [R] getting data from a webpage

All,

I was getting swap rate data from the St. Louis Fed FRED database via the 
FRED API.  ICE stopped reporting to FRED and now I must get the data from the 
ICE website.  I would like to use httr to get the data but I really don't know 
much about website design.  I think the form redirects, but I am not sure that 
is the case, much less how to identify what website the form redirects to.  I 
used the browser's developer tools and Inspect Element to come up with the code 
below, which failed miserably.  In addition, I purchased the book Automated 
Data Collection with R, which has not been too useful in helping me understand 
how to navigate pages using forms and redirects.

Can anyone provide a good reference for understanding how to get data from 
websites using forms and redirects?  Specifically,

How to find the actual webpage to which one must submit the POST request.
How to find the redirected page that really has the data.

Best,
Glenn

library(httr)
# get initial cookies
h <- handle("https://www.theice.com/")
GET(handle = h)
POST(url = "https://www.theice.com/marketdata/reports/180",
     body = list(reportDate = "15-Dec-2016",
                 SeriesNameAnRunCode_chosen = "USD Rates 1100"),
     encode = "form", handle = h)
page <- GET(url = "https://www.theice.com/marketdata/reports/icebenchmarkadmin/ISDAFIXHistoricalRates.shtml",
            handle = h)
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






