No, that collection never receives a forceMerge or an expungeDeletes. Almost all (99.999%) documents are overwritten every 90 minutes.
A single shard has 16k docs (97k total) but is only 300 MB in size. Maybe that's a problem there. I can simply flip a switch to forceMerge after the periodic update cycle, but I would prefer Lucene to do it for me.

Thanks,
Markus

-----Original message-----
> From:Erick Erickson <erickerick...@gmail.com>
> Sent: Wednesday 4th October 2017 14:56
> To: solr-user <solr-user@lucene.apache.org>
> Subject: Re: Very high number of deleted docs
>
> Did you _ever_ do a forceMerge/optimize or expungeDeletes?
>
> Here's the problem: TieredMergePolicy (TMP) has a maximum segment size
> it will allow, 5G by default. No segment is even considered for
> merging unless it has < 2.5G (or half whatever the default is)
> non-deleted docs, the logic being that to merge similar size segments,
> each has to be less than half the max size.
>
> However, optimize/forceMerge and expungeDeletes do not have a limit on
> the segment size. So say you optimize at some point and have a 100G
> segment. It won't get merged until you have 97.5G worth of deleted
> docs.
>
> More here:
> https://issues.apache.org/jira/browse/LUCENE-7976
>
> Erick
>
> On Wed, Oct 4, 2017 at 5:47 AM, Markus Jelsma
> <markus.jel...@openindex.io> wrote:
> > Do you mean a periodic forceMerge? That is usually considered a bad habit
> > on this list (i agree). It is just that i am actually very surprised this
> > can happen at all with default settings. This factory, unfortunately does
> > not seem to support settings configured in solrconfig.
> >
> > Thanks,
> > Markus
> >
> > -----Original message-----
> >> From:Amrit Sarkar <sarkaramr...@gmail.com>
> >> Sent: Wednesday 4th October 2017 14:42
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Very high number of deleted docs
> >>
> >> Hi Markus,
> >>
> >> Emir already mentioned tuning *reclaimDeletesWeight* which affects segments
> >> about to merge priority. Optimising index time by time, preferably
> >> scheduling weekly / fortnight / ..., at low traffic period to never be in
> >> such odd position of 80% deleted docs in total index.
> >>
> >> Amrit Sarkar
> >> Search Engineer
> >> Lucidworks, Inc.
> >> 415-589-9269
> >> www.lucidworks.com
> >> Twitter http://twitter.com/lucidworks
> >> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
> >>
> >> On Wed, Oct 4, 2017 at 6:02 PM, Emir Arnautović <
> >> emir.arnauto...@sematext.com> wrote:
> >>
> >> > Hi Markus,
> >> > You can set reclaimDeletesWeight in merge settings to some higher value
> >> > than default (I think it is 2) to favor segments with deleted docs when
> >> > merging.
> >> >
> >> > HTH,
> >> > Emir
> >> > --
> >> > Monitoring - Log Management - Alerting - Anomaly Detection
> >> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >> >
> >> >
> >> >
> >> > > On 4 Oct 2017, at 13:31, Markus Jelsma <markus.jel...@openindex.io>
> >> > wrote:
> >> > >
> >> > > Hello,
> >> > >
> >> > > Using a 6.6.0, i just spotted one of our collections having a core of
> >> > which over 80 % of the total number of documents were deleted documents.
> >> > >
> >> > > It has <mergePolicyFactory
> >> > > class="org.apache.solr.index.TieredMergePolicyFactory"/>
> >> > configured with no non-default settings.
> >> > >
> >> > > Is this supposed to happen? How can i prevent these kind of numbers?
> >> > >
> >> > > Thanks,
> >> > > Markus
> >> >
> >> >
> >> >
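
The size rule Erick describes in the quoted reply (a segment is only considered for merging once its non-deleted data is below half the maximum segment size) can be put into numbers with a small sketch. This is a simplified model of that description only, not Lucene's actual TieredMergePolicy code; the function names and the use of gigabytes as plain floats are assumptions made for illustration.

```python
# Simplified model of the eligibility rule as described in the thread.
# Names and units are hypothetical; this is not Lucene's implementation.

DEFAULT_MAX_SEGMENT_GB = 5.0  # default maximum segment size quoted in the thread


def is_merge_candidate(live_gb, max_segment_gb=DEFAULT_MAX_SEGMENT_GB):
    """True if the segment's non-deleted data is below half the max segment size."""
    return live_gb < max_segment_gb / 2.0


def deletes_needed_gb(segment_gb, max_segment_gb=DEFAULT_MAX_SEGMENT_GB):
    """Approximate amount of deleted data a segment must accumulate
    before its live data drops to half the max segment size."""
    return max(0.0, segment_gb - max_segment_gb / 2.0)


if __name__ == "__main__":
    # The 100G segment from Erick's example.
    print(deletes_needed_gb(100.0))   # 97.5
    # A segment no larger than the 300 MB shard mentioned in the reply.
    print(is_merge_candidate(0.3))    # True
```

Under this model a 300 MB shard is far below the 2.5G threshold, which suggests the maximum segment size alone would not account for the deleted-docs ratio reported here.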