Did you _ever_ do a forceMerge/optimize or expungeDeletes?

Here's the problem: TieredMergePolicy (TMP) has a maximum segment size it will allow, 5G by default. No segment is even considered for merging unless it has less than 2.5G (or half of whatever the maximum is set to) of non-deleted docs, the logic being that to merge similar-size segments, each has to be less than half the max size.
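To make the arithmetic concrete, here is a rough Python sketch of the eligibility rule as described above. It is not Lucene's actual code: the function names are made up for illustration, sizes are in GB, and it assumes a segment's live size is simply its total size minus its deleted portion.

```python
# Simplified model of the rule described above; NOT the real TieredMergePolicy logic.
# All names here are hypothetical. Sizes are in GB.

DEFAULT_MAX_SEGMENT_GB = 5.0  # the 5G default mentioned above


def live_size_gb(segment_gb, deleted_fraction):
    """Size of the non-deleted docs, assuming size scales with doc count."""
    return segment_gb * (1.0 - deleted_fraction)


def is_merge_candidate(segment_gb, deleted_fraction,
                       max_segment_gb=DEFAULT_MAX_SEGMENT_GB):
    """A segment is only considered once its live docs drop below half the max."""
    return live_size_gb(segment_gb, deleted_fraction) < max_segment_gb / 2.0


def deleted_gb_needed(segment_gb, max_segment_gb=DEFAULT_MAX_SEGMENT_GB):
    """How much of the segment must be deleted before it is even considered."""
    return max(0.0, segment_gb - max_segment_gb / 2.0)


if __name__ == "__main__":
    # A hypothetical 20G segment with 80% deletes still holds about 4G of live docs.
    print(is_merge_candidate(20.0, 0.80))
    print(deleted_gb_needed(20.0))
```

Under this model, a segment that is already under the 5G cap becomes a candidate fairly quickly, while an oversized one has to shed almost everything first.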
However, optimize/forceMerge and expungeDeletes do not have a limit on the segment size. So say you optimize at some point and have a 100G segment. It won't get merged until you have 97.5G worth of deleted docs.

More here: https://issues.apache.org/jira/browse/LUCENE-7976

Erick

On Wed, Oct 4, 2017 at 5:47 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:
> Do you mean a periodic forceMerge? That is usually considered a bad habit on
> this list (i agree). It is just that i am actually very surprised this can
> happen at all with default settings. This factory, unfortunately does not
> seem to support settings configured in solrconfig.
>
> Thanks,
> Markus
>
> -----Original message-----
>> From:Amrit Sarkar <sarkaramr...@gmail.com>
>> Sent: Wednesday 4th October 2017 14:42
>> To: solr-user@lucene.apache.org
>> Subject: Re: Very high number of deleted docs
>>
>> Hi Markus,
>>
>> Emir already mentioned tuning *reclaimDeletesWeight* which affects segments
>> about to merge priority. Optimising index time by time, preferably
>> scheduling weekly / fortnight / ..., at low traffic period to never be in
>> such odd position of 80% deleted docs in total index.
>>
>> Amrit Sarkar
>> Search Engineer
>> Lucidworks, Inc.
>> 415-589-9269
>> www.lucidworks.com
>> Twitter http://twitter.com/lucidworks
>> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>>
>> On Wed, Oct 4, 2017 at 6:02 PM, Emir Arnautović <
>> emir.arnauto...@sematext.com> wrote:
>>
>> > Hi Markus,
>> > You can set reclaimDeletesWeight in merge settings to some higher value
>> > than default (I think it is 2) to favor segments with deleted docs when
>> > merging.
>> >
>> > HTH,
>> > Emir
>> > --
>> > Monitoring - Log Management - Alerting - Anomaly Detection
>> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>> >
>> > > On 4 Oct 2017, at 13:31, Markus Jelsma <markus.jel...@openindex.io> wrote:
>> > >
>> > > Hello,
>> > >
>> > > Using a 6.6.0, i just spotted one of our collections having a core of
>> > > which over 80 % of the total number of documents were deleted documents.
>> > >
>> > > It has <mergePolicyFactory
>> > > class="org.apache.solr.index.TieredMergePolicyFactory"/>
>> > > configured with no non-default settings.
>> > >
>> > > Is this supposed to happen? How can i prevent these kind of numbers?
>> > >
>> > > Thanks,
>> > > Markus