Re: Problem with SolrJ and indexing PDF files

2019-05-19 Thread Erick Erickson

Here’s a skeletal program to get you started using Tika directly in a SolrJ client, along with a long explanation of why Solr’s extracting request handler is probably not what you want to use in production: https://lucidworks.com/2012/02/14/indexing-with-solrj/

Note that SolrServer has since been renamed SolrClient.
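For readers who want the shape of the approach without opening the link, here is a minimal sketch of parsing a file with Tika on the client and sending the extracted text to Solr through SolrJ. It is not the program from the linked article. The Solr URL, the collection name "mycollection", and the field name "text" are assumptions for illustration and have to match your own setup and schema. It assumes tika-parsers and solr-solrj are on the classpath.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.sax.BodyContentHandler;

class TikaSolrJSketch {

    // Extract the plain-text body of a PDF, Word file, etc. with Tika.
    static String extractText(Path file) throws Exception {
        AutoDetectParser parser = new AutoDetectParser();
        // -1 disables Tika's default limit on the number of extracted characters.
        BodyContentHandler handler = new BodyContentHandler(-1);
        Metadata metadata = new Metadata();
        try (InputStream in = Files.newInputStream(file)) {
            parser.parse(in, handler, metadata, new ParseContext());
        }
        return handler.toString();
    }

    // Build the Solr document. "text" is an assumed field name;
    // use whatever field your schema defines for the document body.
    static SolrInputDocument toSolrDocument(String id, String text) {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", id);
        doc.addField("text", text);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Path file = Paths.get(args[0]);
        // Assumed URL and collection name; adjust to your installation.
        String solrUrl = "http://localhost:8983/solr/mycollection";

        String text = extractText(file);
        SolrInputDocument doc = toSolrDocument(file.toAbsolutePath().toString(), text);

        try (SolrClient client = new HttpSolrClient.Builder(solrUrl).build()) {
            client.add(doc);
            client.commit();
        }
    }
}
```

Doing the parsing in the client keeps a malformed or very large document from affecting the Solr node itself. For more than a handful of files you would probably batch the add calls and commit once at the end rather than per document.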

Re: Problem with SolrJ and indexing PDF files

2019-05-19 Thread Jörn Franke

You can use the Tika library to parse the PDFs and then post the text to the Solr servers.

> On 19.05.2019 at 11:02, Mareike Glock wrote:
>
> Dear Solr Team,
>
> I am trying to index Word and PDF documents with Solr using SolrJ, but most
> of the examples I found on the internet use the So