On 03/30/2016 08:23 AM, Jostein Elvaker Haande wrote:
On 30 March 2016 at 12:25, Markus Jelsma <markus.jel...@openindex.io> wrote:
Hello - with TieredMergePolicy and default reclaimDeletesWeight of 2.0, and 
frequent updates, it is not uncommon to see a ratio of 25%. If you want deletes 
to be reclaimed more often, e.g. weight of 4.0, you will see very frequent 
merging of large segments, killing performance if you are on spinning disks.

Most of our installations are on spinning disks, so if I want a more
aggressive reclaim, this will impact performance. This is of course
something that I do not desire, so I'm wondering if scheduling a
commit with 'expungeDeletes' during off peak business hours is a
better approach than setting up a more aggressive merge policy.


As far as my experimentation with @expungeDeletes goes, if the data you indexed and committed using @expungeDeletes neither touched segments containing deleted documents nor was large enough to trigger a merge with such a segment, no deleted documents will be removed. Basically, @expungeDeletes only expunges deletes in segments affected by the commit. If you have a large update that touches many segments containing deleted documents and you use @expungeDeletes, it could be just as resource intensive as an optimize.
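
For reference, a minimal sketch of the kind of commit being discussed, written as a Solr XML update message. It assumes the message is posted to the core's /update handler; how and when it gets sent (for example from an off-peak scheduled job) is left open here and is not something tested in this thread.

```xml
<!-- Commit and ask Solr to merge away deleted documents.
     Per the behaviour described above, this may remove nothing if the
     commit does not involve segments that contain deletes. -->
<commit expungeDeletes="true"/>
```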

My setting for reclaimDeletesWeight:
  <double name="reclaimDeletesWeight">5.0</double>

It keeps the deleted documents down to ~10% without any noticeable impact on resources or performance. But I'm still in the testing phase with this setting.
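
A sketch of where that setting would sit in solrconfig.xml, assuming the <mergePolicy> syntax used by Solr 4.x/5.x. The maxMergeAtOnce and segmentsPerTier values are the TieredMergePolicy defaults and are shown only as placeholders, not as part of my configuration; only reclaimDeletesWeight is the value I actually changed.

```xml
<indexConfig>
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <!-- Placeholder values (TieredMergePolicy defaults), for illustration only -->
    <int name="maxMergeAtOnce">10</int>
    <int name="segmentsPerTier">10</int>
    <!-- Default is 2.0; higher values favour merges that reclaim deletes -->
    <double name="reclaimDeletesWeight">5.0</double>
  </mergePolicy>
</indexConfig>
```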
