RE: Solr performance issue on indexing

2017-04-04 Thread Allison, Timothy B.
> Also we will try to decouple tika to solr.

+1

-Original Message-
From: tstusr [mailto:ulfrhe...@gmail.com]
Sent: Friday, March 31, 2017 4:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr performance issue on indexing

Hi, thanks for the feedback. Yes, it is about OOM, ind

Re: Solr performance issue on indexing

2017-03-31 Thread Erick Erickson
If, by chance, the docs you're sending get routed to different Solr nodes, then all the processing is in parallel. I don't know if there's a good way to ensure that the docs get sent to different replicas on different Solr instances. You could try addressing specific Solr replicas, something like "b

Re: Solr performance issue on indexing

2017-03-31 Thread tstusr
Hi, thanks for the feedback. Yes, it is about OOM; indeed, even the Solr instance becomes unavailable. As I was saying, I can't find more relevant information in the logs. We are able to increase the JVM memory amount, so that will be the first thing we do. As far as I know, all documents are bound to that am

Re: Solr performance issue on indexing

2017-03-31 Thread Erick Erickson
First, running multiple threads with PDF files to a Solr running 4G of JVM is...ambitious. You say it crashes; how? OOMs? Second, while the extracting request handler is a fine way to get up and running, any problems with Tika will affect Solr. Tika does a great job of extraction, but there are so
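
The decoupling discussed in this thread (running extraction outside Solr so that a bad file cannot take the Solr JVM down with it) might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the posters' actual setup: `extract_text` stands in for whatever client-side extractor is used (for example, a Tika process running outside Solr), and the field names `id` and `content_txt`, the collection name, and the URL are all hypothetical. The only Solr feature relied on is the JSON update endpoint, which accepts a JSON array of documents.

```python
import json
import urllib.request
from typing import Callable, Iterable


def build_update_payload(paths: Iterable[str],
                         extract_text: Callable[[str], str]) -> bytes:
    """Extract text on the client side and build a Solr JSON update body.

    `extract_text` is a hypothetical callable (assumption): it takes a file
    path and returns plain text, e.g. by calling Tika outside of Solr.
    """
    docs = []
    for path in paths:
        try:
            text = extract_text(path)
        except Exception as exc:
            # A file that breaks the extractor is skipped here, on the
            # client, instead of causing trouble inside the Solr JVM.
            print(f"skipping {path}: {exc}")
            continue
        # Field names are assumptions; adjust them to the real schema.
        docs.append({"id": path, "content_txt": text})
    return json.dumps(docs).encode("utf-8")


def post_to_solr(payload: bytes, update_url: str) -> int:
    """POST the JSON payload to a Solr update URL and return the HTTP status.

    `update_url` is an assumption, for example
    "http://localhost:8983/solr/mycollection/update?commitWithin=10000".
    """
    request = urllib.request.Request(
        update_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

With this split, extraction failures and memory spikes stay in the client process, and Solr only ever receives plain text fields. Sending documents in modest batches rather than one request per file would also reduce load on the indexing side.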