You are talking about digging data out of a dynamic web page. The data displayed to you, I guess, is fetched by filtering some database, and the dropdowns you see on the page must be widgets that do this filtering.
Some sites do this filtering through the URL, so the final URL changes along with the filters used. Other sites only offer the data through embedded widgets and make no change to the URL you see. Your case probably belongs to the second type. You could analyze the source of that web page; if you are lucky, you will find the code that handles the filtering, and it might offer some help. :D

2011/1/5 Mike Marchywka <marchy...@hotmail.com>

> > Date: Tue, 4 Jan 2011 10:54:19 -0800
> > From: egregory2...@yahoo.com
> > To: r-help@r-project.org
> > Subject: [R] Navigating web pages using R
> >
> > R-Help,
> >
> > I'm trying to obtain some data from a webpage which masks the URL from the user,
> > so an explicit URL will not work. For example, when one navigates to the web
> > page the URL looks something like:
> > http://137.113.141.205/rpt34s.php?flags=1 (changed for privacy, but i'm not sure
> > you could access it anyways since it's internal to the agency I work for).
>
> LOL, presuming you are not a disgruntled employee, it is always amusing to
> see some entity with a fancy cryptic web design drink their own Koolaid :)
> This is the most annoying kind of code to write, especially when there is
> no reason such as revenue model to make it hard to get. I've posted in other
> forums about the general need for an API if you are providing data to others
> in a non-hostile setting.
>
> > The site has three drop-down menus for "Site", "Month," and "Year". When a
> > combination is selected of these, the resulting URL is
> > always http://137.113.141.205/rpt34s (nothing changes, except "flags=1" is
> > dropped), so what I need to be able to do is write something that will navigate
> > to the original URL, then select some combination of "Site", "Month", and
> > "Year," and then submit the query to the site to navigate to the page with the
> > data.
> > Is this a capability that R has as a language? Unfortunately, I'm unfamiliar
> > with html or php programming, so if this question belongs in a forum on that I
> > apologize. I'm trying to centralize all of my code for my analysis in R!
>
> I'm sure that ultimately you can code this in R but for digging out what
> you need there may be better approaches.
> First I would try to contact the page author or determine if there is
> a better way to get the same data. Failing that, you may be able to find
> a "form" section in the html and copy that. Firefox is supposed to have something
> called "firebug" to let you see what the page does but I've never actually used
> that. Generally I use linux or cygwin command line tools to diagnose this junk,
> R may support some of these features but this is a common issue outside of R too
> and so it may be worthwhile learning the other tools. If all else fails,
> downloading a local copy of the page etc, you may be able to do a packet capture
> and just see what it does by brute force.
>
> From what I have seen, the R tools are pretty much named after the linux tools,
> curl for example.
>
> > Thank you,
> > -Erik Gregory
> > Student Assistant, California EPA
> > CSU Sacramento, Mathematics
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
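
To make the "form" idea from the thread a bit more concrete, here is a minimal sketch in R of reading the dropdown names out of the page source and building the request a browser would send. Everything specific in it is an assumption made up for illustration: the HTML snippet, the field names (site, month, year), the option values, and the form's action and method. The real ones have to be read from the actual page source. The last step relies on postForm() from the RCurl package and is only defined, not called, since the real server is internal.

```r
# Hypothetical page source standing in for what "View Source" would show.
# The field names, option values, action and method are assumptions.
page_source <- '
<form action="rpt34s.php" method="post">
  <select name="site"><option value="12">Site 12</option></select>
  <select name="month"><option value="01">January</option></select>
  <select name="year"><option value="2010">2010</option></select>
  <input type="submit" value="Go">
</form>'

# Find the opening <form> tag and read where and how the form is submitted.
form_tag    <- regmatches(page_source, regexpr("<form[^>]*>", page_source))
form_action <- sub('.*action="([^"]*)".*', "\\1", form_tag)
form_method <- sub('.*method="([^"]*)".*', "\\1", form_tag)

# Find every opening <select> tag and pull out its name attribute.
# These names are the parameters the server expects.
select_tags <- regmatches(page_source, gregexpr("<select[^>]*>", page_source))[[1]]
field_names <- sub('.*name="([^"]*)".*', "\\1", select_tags)

# One combination of dropdown choices (values are made up).
choices <- c(site = "12", month = "01", year = "2010")

# The URL-encoded body a browser would send for this combination.
body <- paste0(names(choices), "=",
               vapply(choices, URLencode, character(1), reserved = TRUE),
               collapse = "&")

print(form_action)
print(form_method)
print(field_names)
print(body)

# Submitting the form for real, assuming the RCurl package is installed.
# style = "POST" sends the fields as an ordinary URL-encoded form body.
submit_form <- function(url, fields) {
  if (!requireNamespace("RCurl", quietly = TRUE)) {
    stop("the RCurl package is needed to submit the form")
  }
  RCurl::postForm(url, .params = as.list(fields), style = "POST")
}
```

If the real form turns out to use these names, a call such as submit_form("http://137.113.141.205/rpt34s.php", choices) would return the result page as text, which could then be parsed inside R. Whether the server also needs cookies or hidden fields can only be told from the real page.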