On 10/22/2019 1:12 PM, Nicolas Paris wrote:
>> We, at Auto-Suggest, also do atomic updates daily and specifically
>> changing merge factor gave us a boost of ~4x
>
> Interesting. What kind of change exactly on the merge factor side ?
The mergeFactor setting is deprecated. Instead, use maxMergeAtOnce,
segmentsPerTier, and a setting that is not mentioned in the ref guide --
maxMergeAtOnceExplicit.
Set the first two to the same number, and the third to at least three
times that number.
The default setting for maxMergeAtOnce and segmentsPerTier is 10, with
30 for maxMergeAtOnceExplicit. When you're trying to increase indexing
speed and you think segment merging is interfering, you want to increase
these values. Note that larger values will also increase the number of
files that your Solr install keeps open.
https://lucene.apache.org/solr/guide/8_1/indexconfig-in-solrconfig.html#mergepolicyfactory
When I built a Solr setup, I increased maxMergeAtOnce and
segmentsPerTier to 35, and maxMergeAtOnceExplicit to 105. This made
merging happen a lot less frequently.
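
For illustration, the values above might go into the <indexConfig>
section of solrconfig.xml roughly like this. This is a sketch that
assumes a Solr 8.x install using TieredMergePolicyFactory, as in the
linked guide; check that your version still accepts
maxMergeAtOnceExplicit before relying on it.

```xml
<indexConfig>
  <!-- Assumed: TieredMergePolicyFactory on Solr 8.x. -->
  <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
    <!-- Keep these two at the same value. -->
    <int name="maxMergeAtOnce">35</int>
    <int name="segmentsPerTier">35</int>
    <!-- At least three times the value used above. -->
    <int name="maxMergeAtOnceExplicit">105</int>
  </mergePolicyFactory>
</indexConfig>
```

The core or collection has to be reloaded for the change to take
effect.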
> Would you say atomical update is faster than regular replacement of
> documents ? (considering my first thought on this below)
On the Solr side, atomic updates will be slightly slower than indexing
the whole document provided to Solr. When an atomic update is done,
Solr will find the existing document, then combine what's in that
document with the changes you specify using the atomic update, and then
index the whole combined document as a new document that replaces the
original.
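
To make the difference concrete, here is a rough sketch of the two
kinds of XML update message, shown together but sent as separate
requests. The document id and the field names (title, price) are
made up for the example and would need to match your schema.

```xml
<!-- Full replacement: the client sends every field of the document. -->
<add>
  <doc>
    <field name="id">doc1</field>
    <field name="title">Example title</field>
    <field name="price">12.50</field>
  </doc>
</add>

<!-- Atomic update: the client sends only the id and the change.
     Solr looks up the existing doc1, applies the "set" to price,
     and reindexes the combined document. -->
<add>
  <doc>
    <field name="id">doc1</field>
    <field name="price" update="set">9.99</field>
  </doc>
</add>
```

For the second form to work, Solr has to be able to rebuild the rest of
the document from the index, so the other fields generally need to be
stored or have docValues.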
Whether or not atomic updates are faster or slower in practice than
indexing the whole document will depend on how your source systems work,
and that is not something we can know. If Solr can access the previous
document faster than you can get the document from your source system,
then atomic updates might be faster.
Thanks,
Shawn