On 10/22/2019 1:12 PM, Nicolas Paris wrote:
>> We, at Auto-Suggest, also do atomic updates daily and specifically
>> changing merge factor gave us a boost of ~4x
>
> Interesting. What kind of change exactly on the merge factor side?

The mergeFactor setting is deprecated. Instead, use maxMergeAtOnce, segmentsPerTier, and a setting that is not mentioned in the ref guide -- maxMergeAtOnceExplicit.

Set the first two to the same number, and the third to at least three times that number.

The default setting for maxMergeAtOnce and segmentsPerTier is 10, with 30 for maxMergeAtOnceExplicit. When you're trying to increase indexing speed and you think segment merging is interfering, increase these values. Note that larger values let more segments accumulate before a merge happens, which increases the number of files that your Solr install keeps open.

https://lucene.apache.org/solr/guide/8_1/indexconfig-in-solrconfig.html#mergepolicyfactory

When I built a Solr setup, I increased maxMergeAtOnce and segmentsPerTier to 35, and maxMergeAtOnceExplicit to 105. This made merging happen a lot less frequently.
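
As a rough illustration of the rule above (same value for the first two settings, at least three times that for the third), here is a small Python sketch that renders the matching solrconfig.xml fragment. The helper name merge_policy_fragment is made up for this example. The fragment assumes the TieredMergePolicyFactory syntax shown in the 8.1 ref guide, and since the guide does not list maxMergeAtOnceExplicit, check it against your Solr version before relying on it.

```python
def merge_policy_fragment(segments, explicit_multiplier=3):
    """Render a <mergePolicyFactory> fragment for solrconfig.xml.

    Hypothetical helper: it only builds text, it does not talk to Solr.
    """
    if segments < 2:
        raise ValueError("segments must be at least 2")
    if explicit_multiplier < 3:
        raise ValueError("maxMergeAtOnceExplicit should be at least 3x the others")

    # maxMergeAtOnce and segmentsPerTier get the same value;
    # maxMergeAtOnceExplicit is a multiple of that value.
    settings = {
        "maxMergeAtOnce": segments,
        "segmentsPerTier": segments,
        "maxMergeAtOnceExplicit": segments * explicit_multiplier,
    }

    lines = ['<mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">']
    for name, value in settings.items():
        lines.append(f'  <int name="{name}">{value}</int>')
    lines.append("</mergePolicyFactory>")
    return "\n".join(lines)


if __name__ == "__main__":
    # The values from my setup: 35 / 35 / 105.
    print(merge_policy_fragment(35))
```

The printed element would go inside the indexConfig section of solrconfig.xml.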

> Would you say atomical update is faster than regular replacement of
> documents? (considering my first thought on this below)

On the Solr side, atomic updates will be slightly slower than indexing a whole document provided to Solr. When an atomic update is done, Solr will find the existing document, combine what's in that document with the changes you specify in the atomic update, and then index the whole combined document as a new document that replaces the original.
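
To make those three steps concrete, here is a toy Python model of them. It is not Solr's code: the index is just a dict keyed by id, the function name apply_atomic_update is invented for this sketch, and only the set, add, and inc modifiers are handled.

```python
def apply_atomic_update(index, update, id_field="id"):
    """Toy model of an atomic update against an in-memory 'index' (a dict)."""
    doc_id = update[id_field]

    # Step 1: find the existing document.
    existing = index[doc_id]

    # Step 2: combine the existing fields with the requested changes.
    combined = dict(existing)
    for field, change in update.items():
        if field == id_field:
            continue
        if not isinstance(change, dict):
            raise ValueError(f"field {field!r} needs a modifier such as 'set'")
        for op, value in change.items():
            if op == "set":
                combined[field] = value
            elif op == "add":
                current = combined.get(field, [])
                if not isinstance(current, list):
                    current = [current]
                values = value if isinstance(value, list) else [value]
                combined[field] = current + values
            elif op == "inc":
                combined[field] = combined.get(field, 0) + value
            else:
                raise ValueError(f"unsupported modifier: {op}")

    # Step 3: index the whole combined document, replacing the original.
    index[doc_id] = combined
    return combined


if __name__ == "__main__":
    index = {"1": {"id": "1", "title": "Solr", "tags": ["search"], "views": 10}}
    update = {
        "id": "1",
        "title": {"set": "Apache Solr"},
        "tags": {"add": "lucene"},
        "views": {"inc": 5},
    }
    print(apply_atomic_update(index, update))
```

The point of the model is that the full document is still reindexed in step 3, so the lookup in step 1 is extra work compared to sending the whole document yourself.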

Whether atomic updates are faster or slower in practice than indexing the whole document depends on how your source systems work, and that is not something we can know. If Solr can access the previous document faster than you can get the document from your source system, then atomic updates might be faster.

Thanks,
Shawn
