
Re: Problem with SolrJ and indexing PDF files

2019-05-19 Thread Erick Erickson

Here’s a skeletal program to get you started using Tika directly in a SolrJ client, with a long explication of why using Solr’s extracting request handler is probably not what you want to do in production: https://lucidworks.com/2012/02/14/indexing-with-solrj/

SolrServer was renamed SolrClient
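For readers who want the shape of that approach without following the link, here is a minimal sketch (not the program from the linked article). It extracts text on the client with Tika and sends it to Solr through SolrClient instead of the old SolrServer class. Several things are assumptions rather than details from this thread: the class name TikaSolrIndexer, the Solr URL and collection name "mycollection", and the field name "content", which has to match whatever your schema defines. It also assumes the Tika parsers and a SolrJ release that still ships HttpSolrClient (such as the 7.x/8.x line) are on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;
import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.SAXException;

// Hypothetical class name, used only for this illustration.
class TikaSolrIndexer {

    // Extract plain text from a PDF or Word file on the client side.
    static String extractText(Path file) throws IOException, SAXException, TikaException {
        AutoDetectParser parser = new AutoDetectParser();
        // -1 disables BodyContentHandler's default write limit.
        BodyContentHandler handler = new BodyContentHandler(-1);
        Metadata metadata = new Metadata();
        try (InputStream in = Files.newInputStream(file)) {
            parser.parse(in, handler, metadata, new ParseContext());
        }
        return handler.toString();
    }

    // Build the Solr document. "content" is an assumed field name;
    // use a field that exists in your own schema.
    static SolrInputDocument toDocument(String id, String text) {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", id);
        doc.addField("content", text.trim());
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Path file = Paths.get(args[0]);
        // Assumed URL and collection name; adjust to your installation.
        String solrUrl = "http://localhost:8983/solr/mycollection";
        try (SolrClient client = new HttpSolrClient.Builder(solrUrl).build()) {
            client.add(toDocument(file.toString(), extractText(file)));
            client.commit();
        }
    }
}
```

Run it with the path of one document as the only argument. Using the file path as the document id is just a convenient choice for the sketch. A real indexer would probably batch documents and commit less often.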

Re: Problem with SolrJ and indexing PDF files

2019-05-19 Thread Jörn Franke

You can use the Tika library to parse the PDFs and then post the text to the Solr servers.

> On 19.05.2019 at 11:02, Mareike Glock wrote:
>
> Dear Solr Team,
>
> I am trying to index Word and PDF documents with Solr using SolrJ, but most
> of the examples I found on the internet use the So

Problem with SolrJ and indexing PDF files

2019-05-19 Thread Mareike Glock

Dear Solr Team, I am trying to index Word and PDF documents with Solr using SolrJ, but most of the examples I found on the internet use the SolrServer class which I guess is deprecated. The connection to Solr itself is working, because I can add SolrInputDocuments to the index but it does not