On 6/2/2019 4:35 PM, John Davis wrote:
If we assume there is no query load then effectively this boils down to
the most effective way of adding a large number of documents to the Solr
index. I've looked through SolrJ, DIH and others -- is the bottom line
across all of them to "batch updates" and not commit for as long as possible?

If you want the maximum indexing speed, you'll need to batch updates and send multiple batches in parallel. I cannot tell you how much concurrency you need; you'll have to experiment. I would probably start with the same number of threads as you have CPU cores in your Solr server, then try 1.5 times that and 2 times that, and see which works better. I'd even try 3 or 4 times the CPU count, just to see how it behaves.
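
As a rough illustration of the batching and parallel sending described above, here is a minimal Python sketch. It is not tied to any particular Solr client: `send_batch` is a hypothetical callable that you would supply yourself (for example, something that POSTs one batch to your update handler and returns how many documents it sent). The names `batched`, `index_in_parallel` and `candidate_thread_counts`, and the batch size of 1000, are assumptions made for the example, not recommendations.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(docs, batch_size):
    """Yield successive lists of at most batch_size documents."""
    it = iter(docs)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def candidate_thread_counts(solr_cpu_cores):
    """Thread counts to experiment with: 1x, 1.5x, 2x, 3x and 4x the core count."""
    factors = (1, 1.5, 2, 3, 4)
    return sorted({max(1, int(solr_cpu_cores * f)) for f in factors})


def index_in_parallel(docs, send_batch, batch_size, threads):
    """Send batches concurrently and return the total number of documents sent.

    send_batch is a hypothetical, user-supplied callable that sends one
    batch (a list of documents) and returns the number of documents in it.
    No commit is issued here; commits are left to the server configuration.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Note: pool.map consumes the whole iterable up front, so every batch
        # is held in memory. A very large input would need a bounded queue.
        return sum(pool.map(send_batch, batched(docs, batch_size)))
```

To run the experiment, you could time `index_in_parallel` once for each value returned by `candidate_thread_counts(n)`, where `n` is the number of CPU cores in the Solr server (not in the indexing client), and keep whichever setting gives the best throughput.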

As long as commits are not happening in rapid succession, I wouldn't worry too much about them interfering with indexing speed. Commits that don't open a searcher should probably be no more frequent than every minute or two; commits that DO open a new searcher should be less frequent than that.
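
One way to get commit behavior along those lines is to let Solr handle it with automatic commits in solrconfig.xml, rather than committing from the indexing client. The fragment below is only a sketch: the 60000 ms and 300000 ms values are illustrative assumptions chosen to match the intervals mentioned above, and the right numbers depend on how quickly new documents need to become visible.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit about once a minute, without opening a new searcher. -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit, which DOES open a new searcher, at a longer interval
       (5 minutes here, an assumed value). -->
  <autoSoftCommit>
    <maxTime>300000</maxTime>
  </autoSoftCommit>
</updateHandler>
```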

Thanks,
Shawn
