Yes, this is possible, and it does not seem like a big step from where you are now. Seeing what you already know how to do, in the form of a reproducible example (RE), would help focus the assistance: there are quite a few ways to do this kind of thing, and an RE would show which of them best fits your starting point.
--
Sent from my phone. Please excuse my brevity.
On February 22, 2017 2:52:55 PM PST, henrique monte <henrique.mont...@gmail.com> wrote:
>Sometimes I need to get some data from the web and organize it into a
>data frame, and I waste a lot of time doing it manually. I've been trying
>to figure out how to optimize this process. I've tried some R scraping
>approaches but couldn't get them to work, and I thought there could be an
>easier way to do this. Can anyone help me out with this?
>
>Fictional example:
>
>Here's a webpage with countries listed by continents:
>https://simple.wikipedia.org/wiki/List_of_countries_by_continents
>
>Each country name is also a link that leads to another webpage (specific
>to each country, e.g. https://simple.wikipedia.org/wiki/Angola).
>
>I would like the final result to be a data frame with number of
>observations (rows) = number of countries listed and 4 variables
>(columns): ID = country name, Continent = continent it belongs to,
>Language = official language (from the country's own webpage) and
>Population = most recent population count (from the country's own
>webpage).
>
>...
>
>The main issue I'm trying to figure out is handling several webpages.
>Would it be possible to scrape the countries from the first link as a
>list, together with the links to the countries' webpages, and then create
>a function that runs a scraping command on each of those links to get the
>specific data I'm looking for?
>
>______________________________________________
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
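The two-step approach described in the question (collect the country links from the index page, then run one extraction function over each link) can be sketched with the rvest package. This is only one of several possible ways, not a tested solution for the live Wikipedia pages. The function names (`get_country_links`, `get_infobox_value`, `scrape_countries`), the assumed page layout (an `<h2>` per continent followed by a `<ul>` of links, and an `infobox` table on each country page) and the toy HTML with its made-up values are all assumptions for illustration. The real pages would need their own selectors.

```r
library(rvest)  # also provides read_html() from xml2

# Step 1: one row per country link, tagged with the heading above its list.
# ASSUMPTION: each continent is an <h2> directly followed by a sibling <ul>.
get_country_links <- function(index_page, base_url) {
  headings <- html_elements(index_page, "h2")
  rows <- lapply(headings, function(h) {
    anchors <- html_elements(h, xpath = "following-sibling::ul[1]/li/a")
    data.frame(
      ID        = html_text2(anchors),
      Continent = rep(html_text2(h), length(anchors)),
      url       = xml2::url_absolute(html_attr(anchors, "href"), base_url),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Step 2: read one labelled value from a country page.
# ASSUMPTION: values sit in <td> cells next to a <th> label inside a
# table whose class contains "infobox".
get_infobox_value <- function(page, label) {
  xp <- sprintf(
    "//table[contains(@class, 'infobox')]//tr[th[contains(., '%s')]]/td",
    label
  )
  html_text2(html_element(page, xpath = xp))
}

# Step 3: apply step 2 to every link found in step 1.
# `fetch` defaults to read_html(), which downloads each URL; it is a
# parameter so the sketch can be tried offline.
scrape_countries <- function(index_page, base_url, fetch = read_html) {
  countries <- get_country_links(index_page, base_url)
  pages <- lapply(countries$url, fetch)
  countries$Language <- vapply(pages, get_infobox_value, character(1),
                               label = "Official language")
  countries$Population <- vapply(pages, get_infobox_value, character(1),
                                 label = "Population")
  countries[c("ID", "Continent", "Language", "Population")]
}

# Offline demonstration with made-up HTML (not real Wikipedia content).
toy_index <- read_html('<html><body>
  <h2>Continent X</h2>
  <ul><li><a href="/wiki/Country_A">Country A</a></li>
      <li><a href="/wiki/Country_B">Country B</a></li></ul>
  <h2>Continent Y</h2>
  <ul><li><a href="/wiki/Country_C">Country C</a></li></ul>
</body></html>')

toy_fetch <- function(url) {
  read_html(sprintf('<html><body><table class="infobox">
    <tr><th>Official languages</th><td>Language of %s</td></tr>
    <tr><th>Population</th><td>12345</td></tr>
  </table></body></html>', basename(url)))
}

result <- scrape_countries(
  toy_index,
  base_url = "https://simple.wikipedia.org/wiki/List_of_countries_by_continents",
  fetch = toy_fetch
)
print(result)
```

Against the live site, the call would be `scrape_countries(read_html(index_url), index_url)` with the default `fetch`. The population comes back as text and would still need cleaning and conversion with `as.numeric()`, and a short `Sys.sleep()` between requests would be considerate toward the server.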