Re: Using Solr Cell to index the internal structure of a PDF

2013-10-10 Thread Furkan KAMACI

You can have a look here: http://solr.pl/en/2011/04/04/indexing-files-like-doc-pdf-solr-and-tika-integration/

2013/10/10 Peter Bleackley
> I'm trying to index a set of PDF documents with Solr 4.5.0. So far I can
> get Solr to ingest the entire document as one long string, stored in the
> index

Using Solr Cell to index the internal structure of a PDF

2013-10-10 Thread Peter Bleackley

I'm trying to index a set of PDF documents with Solr 4.5.0. So far I can get Solr to ingest the entire document as one long string, stored in the index as "content". However, I want to index structure within the documents. I know that the ExtractingRequestHandler uses Apache Tika to convert the
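
For readers following the thread, here is a rough sketch of the kind of request being described, and of one way to look at what Tika produces before deciding how to index document structure. It assumes a local Solr 4.x instance with a core named "collection1" and the ExtractingRequestHandler mapped to /update/extract; the host, core name, and helper function names are illustrative assumptions, not details from the original message. The `literal.id`, `commit`, `extractOnly` and `extractFormat` parameters are standard Solr Cell request parameters.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: local Solr 4.x, core "collection1", handler at /update/extract.
SOLR_EXTRACT_URL = "http://localhost:8983/solr/collection1/update/extract"


def build_extract_url(doc_id, extract_only=False, base_url=SOLR_EXTRACT_URL):
    """Build a Solr Cell request URL for a single document (hypothetical helper)."""
    params = {"literal.id": doc_id}  # sets the "id" field of the Solr document
    if extract_only:
        # Ask Solr to return Tika's XHTML output instead of indexing it,
        # which shows what structure survives the PDF conversion.
        params["extractOnly"] = "true"
        params["extractFormat"] = "xml"
    else:
        params["commit"] = "true"  # make the document searchable right away
    return base_url + "?" + urlencode(params)


def post_pdf(pdf_path, url):
    """Send the raw PDF bytes to Solr and return the response body as text."""
    with open(pdf_path, "rb") as f:
        body = f.read()
    req = Request(url, data=body, headers={"Content-Type": "application/pdf"})
    with urlopen(req) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Prints the URL only; call post_pdf() against a running Solr to send a file.
    print(build_extract_url("doc1", extract_only=True))
```

Calling `post_pdf("report.pdf", build_extract_url("doc1", extract_only=True))` against a running instance (the file name is a placeholder) should return the extracted XHTML rather than index it, which can help judge how much of the PDF's internal structure is available to map into separate fields.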