If, by chance, the docs you're sending get routed to different Solr nodes, then all the processing happens in parallel. I don't know of a good way to ensure that the docs get sent to different replicas on different Solr instances. You could try addressing specific Solr replicas, something like "blah blah/solr/collection1_shard1_replica1/export", but I'm not totally sure that'll do what you want either.
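
To make the replica-addressing idea concrete, here is a minimal sketch of spreading documents across replica endpoints round-robin on the client side. The host names, the second core name, and the assign_round_robin helper are all hypothetical, and the handler path is left as a parameter because it depends on which handler you actually post to. It only pairs documents with URLs; it doesn't send anything, and it doesn't guarantee that Solr does the work where you expect.

```python
from itertools import cycle

# Hypothetical replica core URLs; substitute the hosts and core names
# from your own cluster.
REPLICA_URLS = [
    "http://solr-node1:8983/solr/collection1_shard1_replica1",
    "http://solr-node2:8983/solr/collection1_shard2_replica1",
]


def assign_round_robin(doc_paths, replica_urls, handler):
    """Pair each document path with a replica endpoint, cycling through the replicas."""
    targets = cycle(replica_urls)
    return [(path, next(targets) + handler) for path in doc_paths]


if __name__ == "__main__":
    # "/export" is the path used in the message above; adjust it to your handler.
    for path, url in assign_round_robin(["a.pdf", "b.pdf", "c.pdf"], REPLICA_URLS, "/export"):
        print(path, "->", url)
```

Each (path, URL) pair could then be handed to whatever client code does the actual upload, one worker per replica if you want the requests to run concurrently.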
But that still doesn't decouple Tika from the Solr instances running those replicas. So if Tika has a problem, it has the potential to bring the Solr node down.

Best,
Erick

On Fri, Mar 31, 2017 at 1:31 PM, tstusr <ulfrhe...@gmail.com> wrote:
> Hi, thanks for the feedback.
>
> Yes, it is about OOM; indeed, even the Solr instance becomes unavailable. As I was
> saying, I can't find more relevant information in the logs.
>
> We are able to increase the JVM memory amount, so the first thing we'll do will be
> that.
>
> As far as I know, all documents are bounded to that amount (14K); just the
> processing could change. We are running some tests on indexing, and it seems
> to work without concurrent threads. We will also try to decouple Tika from
> Solr.
>
> By the way, will making it available with SolrCloud improve performance? Or
> will there be no perceptible improvement?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-performance-issue-on-indexing-tp4327886p4327914.html
> Sent from the Solr - User mailing list archive at Nabble.com.