Hello, I need some help using RCurl to navigate a site that relies on session cookies. I suspect the issue I am having at present is that I am not handling session cookies properly. At a high level, I need to create a dataset for some analysis. My background is in using R for statistical work, and I am very inexperienced with HTTP and XML. Basically, I would like to use R for a mashup project: navigate to a web site, log in, query some data, clean the data and create a data frame, then navigate to another site, run some queries, and append the results to the data frame.

I have determined that RCurl has all the necessary power to do the navigation and form submission, but I am struggling to get this to work. I have read the help articles on RCurl, but after days of trying I have hit a wall.
Code so far:

library(RCurl)
library(XML)


txt2 <- postForm("http://www.dailyreportonline.com/siteLogin.asp",
      origin = "",
      queryDB = "",
      form_username = "zubin",
      form_password = "xxxx",
      form_save_login = "on",
      login = "Submit")

htmlTreeParse(txt2, asText = TRUE)


This successfully reaches the site, but it is not submitting the form information and logging in; something is not quite right. I contacted an expert, who indicated that most likely I am not handling session cookies properly.

Does someone have example RCurl code that submits a form to a site using session cookies, keeps the session open, and then performs a sequence of operations? I think that would help me learn what I need to do. RCurl seems very powerful. I will need to keep a session open while I log in, navigate, submit another form within the site, and retrieve data.
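
From what I can gather from the RCurl documentation, the general pattern might look something like the sketch below. It is untested against this site and rests on several assumptions: that one curl handle created with cookiefile = "" (which should switch on libcurl's in-memory cookie engine) is enough to carry the session, that the login form expects a URL-encoded POST (style = "POST"), and that the form field names from my code above are correct. The helper fetch_with_session and its data_url argument are hypothetical names of my own.

```r
library(RCurl)
library(XML)

# Hypothetical helper: log in once, then reuse the same curl handle so that
# cookies set by the login response are sent with the follow-up request.
fetch_with_session <- function(login_url, data_url, username, password) {
  # cookiefile = "" enables cookie handling in memory for this handle;
  # followlocation = TRUE follows any redirect issued after the login.
  curl <- getCurlHandle(cookiefile = "", followlocation = TRUE)

  # Submit the login form through the shared handle.
  # style = "POST" is an assumption about what the server expects.
  postForm(login_url,
           form_username = username,
           form_password = password,
           form_save_login = "on",
           login = "Submit",
           style = "POST",
           curl = curl)

  # Same handle, so any session cookie from the login is reused here.
  page <- getURL(data_url, curl = curl)

  # Parse the returned HTML so it can be queried later.
  htmlParse(page, asText = TRUE)
}
```

It would be called with the login URL above, the URL of whatever page holds the data after login, and the credentials. Further requests within the same session would need to reuse the same handle rather than create a new one.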

I will most likely need some formal help, so any students familiar with HTTP, XML, and R who want to earn some money, please contact me.

-zubin

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.