I think a lot would depend on exactly how the data is formatted. I
have used 'pdf2text' converters (many are freely available on the web) to
convert the PDFs to text, and then used R to read in and preprocess the
data to get it into a format I could process.
You can invoke these converters with the 'system' function and
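
As a rough sketch of that workflow (not necessarily what the poster did): the code below assumes the 'pdftotext' command-line utility is installed and on the PATH, and that each table row comes out as one text line with columns separated by two or more spaces. The function names pdf_to_lines and parse_table_lines are hypothetical, made up for illustration; real PDFs will usually need more careful parsing.

```r
# Hypothetical helper: convert one PDF to text by calling an external
# converter through system(). Assumes 'pdftotext' is installed and on PATH.
pdf_to_lines <- function(pdf_file, converter = "pdftotext") {
  txt_file <- tempfile(fileext = ".txt")
  # "-layout" asks pdftotext to keep the physical column layout
  cmd <- paste(converter, "-layout", shQuote(pdf_file), shQuote(txt_file))
  status <- system(cmd)
  if (status != 0) stop("conversion failed for ", pdf_file)
  readLines(txt_file, warn = FALSE)
}

# Hypothetical parser: assumes the first non-blank line is the header and
# that columns are separated by runs of two or more spaces.
parse_table_lines <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]            # drop blank lines
  fields <- strsplit(lines, " {2,}")       # split on 2+ spaces
  n <- lengths(fields)
  fields <- fields[n == n[1]]              # keep rows as wide as the header
  mat <- do.call(rbind, fields)
  df <- as.data.frame(mat[-1, , drop = FALSE], stringsAsFactors = FALSE)
  names(df) <- mat[1, ]
  df
}

# Usage over a directory of PDFs (the directory name is an assumption):
# files  <- list.files("pdfs", pattern = "\\.pdf$", full.names = TRUE)
# tables <- lapply(files, function(f) parse_table_lines(pdf_to_lines(f)))
```

All columns come back as character, so something like type.convert() would still be needed afterwards.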
All,
Is anyone familiar with a way to use R to read table data from a large
collection of PDF files? I'm aware there are various command-line and desktop
utilities that might be able to, e.g., dump PDFs to text, which could then be
parsed for table data. But I'm hoping there is something more