Thanks a lot, that's exactly what I was looking for.

Just a quick question. I agree the form gets submitted to the URL
"http://www.nseindia.com/marketinfo/indices/histdata/historicalindices.jsp"
and I am filling in the form on the page
"http://www.nseindia.com/content/indices/ind_histvalues.htm".

How do I submit arguments like FromDate, ToDate, and Symbol using postForm()
and submit the query to get a similar table?

On Fri, Nov 5, 2010 at 6:43 AM, Duncan Temple Lang <dun...@wald.ucdavis.edu> wrote:

>
> On 11/4/10 2:39 AM, sayan dasgupta wrote:
> > Hi RUsers,
> >
> > Suppose I want to see the data on the website
> > url <- "http://www.nseindia.com/content/indices/ind_histvalues.htm"
> >
> > for the index "S&P CNX NIFTY" for
> > dates "FromDate"="01-11-2010","ToDate"="02-11-2010"
> >
> > then read the HTML table from the page using readHTMLTable()
> >
> > I am using this code
> > webpage <- postForm(url,.params=list(
> >                        "FromDate"="01-11-2010",
> >                        "ToDate"="02-11-2010",
> >                        "IndexType"="S&P CNX NIFTY",
> >                        "Indicesdata"="Get Details"),
> >                  .opts=list(useragent = getOption("HTTPUserAgent")))
> >
> > But it doesn't give me the desired result
>
> You need to be more specific about how it fails to give the desired result.
>
> You are in fact posting to the wrong URL. The form is submitted to a different
> URL - http://www.nseindia.com/marketinfo/indices/histdata/historicalindices.jsp
>
> >
> > Also I was trying to use the function getHTMLFormDescription from the
> > package RHTMLForms but there we can't use the argument
> > .opts=list(useragent = getOption("HTTPUserAgent")) which is needed for this
> > particular website
>
> That's not the case. The function RHTMLForms will generate for you does support
> the .opts parameter.
>
> What you want is something along these lines:
>
>
> # Set default options for RCurl
> # requests
> options(RCurlOptions = list(useragent = "R"))
> library(RCurl)
>
> # Read the HTML page since we cannot use htmlParse() directly
> # as it does not specify the user agent or an
> # Accept: */* header
>
> url <- "http://www.nseindia.com/content/indices/ind_histvalues.htm"
> wp = getURLContent(url)
>
> # Now that we have the page, parse it and use the RHTMLForms
> # package to create an R function that will act as an interface
> # to the form.
> library(RHTMLForms)
> library(XML)
> doc = htmlParse(wp, asText = TRUE)
> # need to set the URL for this document since we read it from
> # text, rather than from the URL directly
>
> docName(doc) = url
>
> # Create the form description and generate the R
> # function to "call" the form
>
> form = getHTMLFormDescription(doc)[[1]]
> fun = createFunction(form)
>
>
> # now we can invoke the form from R. We only need 2
> # inputs - FromDate and ToDate
>
> o = fun(FromDate = "01-11-2010", ToDate = "04-11-2010")
>
> # Having looked at the tables, I think we want the 3rd
> # one.
> table = readHTMLTable(htmlParse(o, asText = TRUE),
>                       which = 3,
>                       header = TRUE,
>                       stringsAsFactors = FALSE)
> table
>
>
> Yes, it is marginally involved. But that is because we cannot simply read
> the HTML document directly with htmlParse(), because of the lack of Accept
> (and useragent) HTTP headers.
>
> >
> > Thanks and Regards
> > Sayan Dasgupta
> >
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
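For the question at the top of this message (passing FromDate, ToDate and the index directly with postForm()), a minimal sketch might look like the following. It is not verified against the live site. The field names (FromDate, ToDate, IndexType, Indicesdata) are copied from the original postForm() attempt quoted above and are assumptions about what the .jsp page expects. The helper names build_params() and fetch_history() are hypothetical, and style = "POST" assumes the form is sent URL-encoded.

```r
# Target URL the form is submitted to, as identified earlier in this thread.
action_url <- "http://www.nseindia.com/marketinfo/indices/histdata/historicalindices.jsp"

# Build the named list of form fields.
# ASSUMPTION: the field names come from the original postForm() attempt in
# this thread; they have not been checked against the live form.
build_params <- function(from_date, to_date, index_type = "S&P CNX NIFTY") {
  list(FromDate    = from_date,
       ToDate      = to_date,
       IndexType   = index_type,
       Indicesdata = "Get Details")
}

# Post the fields to the action URL and return the response body.
# Requires the RCurl package; nothing is sent until this function is called.
# ASSUMPTION: style = "POST" (URL-encoded body) matches what the form sends.
fetch_history <- function(from_date, to_date, ...) {
  RCurl::postForm(action_url,
                  .params = build_params(from_date, to_date, ...),
                  .opts   = list(useragent = "R"),
                  style   = "POST")
}

# Usage (needs network access plus the RCurl and XML packages):
# html <- fetch_history("01-11-2010", "02-11-2010")
# tbl  <- XML::readHTMLTable(XML::htmlParse(html, asText = TRUE),
#                            which = 3, header = TRUE,
#                            stringsAsFactors = FALSE)
```

The which = 3 in the usage lines is carried over from the quoted reply and may need adjusting if the response is laid out differently. If the server rejects the request, comparing these field names with the output of getHTMLFormDescription() would be a reasonable next check.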