If the docs you're sending happen to get routed to different Solr
nodes, then all the processing happens in parallel. I don't know if
there's a good way to ensure that the docs get sent to different
replicas on different Solr instances. You could try addressing
specific Solr replicas, something like "blah
blah/solr/collection1_shard1_replica1/export", but I'm not totally sure
that'll do what you want either.
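
To make the "address specific replicas" idea concrete, here is a minimal
sketch of the client-side part only: spreading a batch of files round-robin
over a list of replica URLs. The host names, core names and handler path
below are hypothetical placeholders, not values from this thread, and the
sketch only builds the plan. It does not send anything, and it does not
answer whether Solr then does the work on the node you addressed.

```python
from collections import defaultdict
from itertools import cycle


def assign_round_robin(doc_paths, replica_urls):
    """Return {replica_url: [doc_path, ...]}, dealing docs out in turn."""
    if not replica_urls:
        raise ValueError("need at least one replica URL")
    plan = defaultdict(list)
    # cycle() repeats the replica list so every doc gets paired with a URL.
    for path, url in zip(doc_paths, cycle(replica_urls)):
        plan[url].append(path)
    return dict(plan)


# Hypothetical replica endpoints (assumption: one core per Solr node).
REPLICAS = [
    "http://solr-node1:8983/solr/collection1_shard1_replica1/update/extract",
    "http://solr-node2:8983/solr/collection1_shard2_replica1/update/extract",
]

if __name__ == "__main__":
    plan = assign_round_robin(["a.pdf", "b.pdf", "c.pdf"], REPLICAS)
    for url, docs in plan.items():
        print(url, docs)
```

Each list in the plan could then be posted by its own worker, so the
requests at least start out on different nodes.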

But that still doesn't decouple Tika from the Solr instances running
those replicas. So if Tika has a problem, it can still bring the Solr
node down.

Best,
Erick

On Fri, Mar 31, 2017 at 1:31 PM, tstusr <ulfrhe...@gmail.com> wrote:
> Hi, thanks for the feedback.
>
> Yes, it is about OOM; in fact, even the Solr instance becomes unavailable.
> As I was saying, I can't find more relevant information in the logs.
>
> We are able to increase the JVM memory amount, so that will be the first
> thing we do.
>
> As far as I know, all documents are bounded to that amount (14K); only the
> processing could change. We are running some indexing tests, and it seems
> to work without concurrent threads. We will also try to decouple Tika from
> Solr.
>
> By the way, will making it available with SolrCloud improve performance? Or
> will there be no perceptible improvement?
