On Sun, Sep 18, 2011 at 7:44 AM, Victor <vdem...@gmail.com> wrote:
> Unfortunately pdf2text doesn't seem to exist either in linux or mac osx.

I think Jeff's main point was to search for software specific to your task (converting a PDF to text). Formatting will be lost, so once you have your text files, I would use regular expressions to find the right part of the text to grab. Some general functions that seem like they might be relevant:

## for getting the text into R
?readLines
?scan

## for finding the part you need
?regexp
?grep

Cheers,

Josh

> Ciao
> Vittorio
>
> On 17 Sep 2011, at 21:00, Jeff Newmiller wrote:
>
>> Doesn't seem like an R task, but see pdf2text? (From pdftools, UNIX command
>> line tools)
>> ---------------------------------------------------------------------------
>> Jeff Newmiller                        The     .....       .....  Go Live...
>> DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>>                                       Live:   OO#.. Dead: OO#..  Playing
>> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>> ---------------------------------------------------------------------------
>> Sent from my phone. Please excuse my brevity.
>>
>> Victor <vdem...@gmail.com> wrote:
>> In an R script I need to extract some figures from many web pages in pdf
>> format. As an example see
>> http://www.terna.it/LinkClick.aspx?fileticket=TTQuOPUf%2fs0%3d&tabid=435&mid=3072
>> from which I would like to extract the "Totale: 1,025,823".
>> Is there any solution?
>> Ciao
>> Vittorio
>>
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
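
As a rough illustration of the suggestion above (get the text into R, then use a regular expression to grab the figure), here is a minimal sketch rather than a tested solution for the actual PDFs. It assumes the PDF has already been converted to a plain-text file by some external tool; the command-line converter from Xpdf/Poppler is usually called pdftotext, but whether it is installed on your system is an assumption. The helper name extract_totale and the sample lines are made up for illustration; only the "Totale: 1,025,823" string comes from the original question.

```r
## Conversion step (not run here; assumes a 'pdftotext' binary is on the PATH):
## system2("pdftotext", c("-layout", "report.pdf", "report.txt"))

## Hypothetical helper: return the number that follows "Totale:" in a
## character vector of text lines, or NA if no such number is found.
extract_totale <- function(lines) {
  ## keep only the lines that contain the label
  hits <- grep("Totale:", lines, fixed = TRUE, value = TRUE)
  if (length(hits) == 0L) return(NA_real_)

  ## digits and thousands separators directly after the label
  pattern <- "(?<=Totale:)\\s*[0-9][0-9,]*"
  m <- regmatches(hits[1], regexpr(pattern, hits[1], perl = TRUE))
  if (length(m) == 0L) return(NA_real_)

  ## drop everything that is not a digit, then convert
  as.numeric(gsub("[^0-9]", "", m))
}

## Made-up stand-in for the converted text file
sample_text <- c("some header line",
                 "Totale: 1,025,823",
                 "some footer line")
tmp <- tempfile(fileext = ".txt")
writeLines(sample_text, tmp)

## Read the text into R and pull out the figure
lines <- readLines(tmp)
total <- extract_totale(lines)
print(total)
```

For many files, the same helper could be applied in a loop (for example with sapply() over a vector of file names). The pattern will likely need adjusting once you see how the converted text is really laid out.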