Shawn's comment seems likely: somehow you're adding all the docs twice
and only committing at the end. In that case there'd be only 1
segment. That's about the only way I can imagine your index has
exactly one segment with exactly half the docs deleted.
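
To make the arithmetic concrete, here is a toy model of that scenario.
It is not Lucene code; index_batch and its bookkeeping are made up for
illustration, and it only assumes that re-adding a doc with an existing
<uniqueKey> marks the older copy as deleted rather than removing it.

```python
def index_batch(doc_ids, passes=2):
    """Toy model: add every doc `passes` times into one uncommitted segment."""
    doc_ids = list(doc_ids)
    segment = []   # each entry is [doc_id, live]
    latest = {}    # doc_id -> position of the currently live copy
    for _ in range(passes):
        for doc_id in doc_ids:
            if doc_id in latest:
                # Overwrite by uniqueKey: the old copy is only marked deleted.
                segment[latest[doc_id]][1] = False
            latest[doc_id] = len(segment)
            segment.append([doc_id, True])
    max_doc = len(segment)
    num_docs = sum(1 for _, live in segment if live)
    return max_doc, num_docs, max_doc - num_docs


if __name__ == "__main__":
    # 4789 unique docs sent twice gives the 9578 / half-deleted picture.
    print(index_batch(range(4789)))
```

With 4789 unique ids sent twice, the model ends up with 9578 entries in
the segment, half of them marked deleted, which matches what you see.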

It'd be interesting for you to look at the admin UI>>schema browser
for your <uniqueKey> field. It'll report the most frequent entries and
if every <uniqueKey> has exactly 2 entries, then you're indexing the
same docs twice in one go.
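
If you'd rather script that check than click through the UI, something
like the sketch below might do. It builds a request for the Luke handler
(/admin/luke with fl and numTerms) and picks out terms seen more than
once. The host, the collection name, and the exact response shape (a
flat [term, count, term, count, ...] list under "topTerms" when wt=json)
are my assumptions, so check them against your own output first.

```python
from urllib.parse import urlencode


def luke_url(base_url, collection, field, num_terms=10):
    """Build a Luke handler URL asking for the top terms of one field."""
    params = urlencode({"fl": field, "numTerms": num_terms, "wt": "json"})
    return f"{base_url}/{collection}/admin/luke?{params}"


def duplicated_terms(luke_response, field):
    """Return {term: count} for top terms that occur more than once.

    Assumes "topTerms" is a flat list: [term, count, term, count, ...].
    """
    top = luke_response["fields"][field].get("topTerms", [])
    pairs = zip(top[0::2], top[1::2])
    return {term: count for term, count in pairs if count > 1}


if __name__ == "__main__":
    # Hypothetical host and collection; "id" stands in for your <uniqueKey>.
    print(luke_url("http://localhost:8983/solr", "mycollection", "id"))
    sample = {"fields": {"id": {"topTerms": ["doc1", 2, "doc2", 2, "doc3", 1]}}}
    print(duplicated_terms(sample, "id"))
```

Fetch the URL with whatever HTTP client you like and pass the decoded
JSON to duplicated_terms; if every key comes back with a count of 2,
that points at the double indexing.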

Plus, the default TieredMergePolicy doesn't necessarily kick in unless
there are multiple segments of roughly the same size. With an index
this small it's perfectly possible that TMP is getting triggered and
saying, in essence, "there's not enough work to do here to bother".

In Solr 7.5, you can optimize/forceMerge without any danger of
creating massive segments. See:
https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/
(pre Solr 7.5)
and
https://lucidworks.com/2018/06/20/solr-and-optimizing-your-index-take-ii/
(Solr 7.5+).
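
For reference, a rough sketch of the two update requests you could send
to get rid of the deletes: a commit with expungeDeletes=true, or an
optimize with maxSegments. Those are standard update handler parameters;
the host and collection name are placeholders, and the helper functions
are just mine for illustration.

```python
from urllib.parse import urlencode


def expunge_deletes_url(base_url, collection):
    """Commit that asks Solr to merge away segments with deleted docs."""
    params = urlencode({"commit": "true", "expungeDeletes": "true"})
    return f"{base_url}/{collection}/update?{params}"


def force_merge_url(base_url, collection, max_segments=1):
    """Optimize (forceMerge) down to at most max_segments segments."""
    params = urlencode({"optimize": "true", "maxSegments": max_segments})
    return f"{base_url}/{collection}/update?{params}"


if __name__ == "__main__":
    # Hypothetical host and collection name.
    base = "http://localhost:8983/solr"
    print(expunge_deletes_url(base, "mycollection"))
    print(force_merge_url(base, "mycollection"))
```

Either URL can be hit with curl or any HTTP client. The same operations
are available from SolrJ if you'd rather do it from the batch process.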

Best,
Erick
On Tue, Nov 27, 2018 at 4:29 AM Markus Jelsma
<markus.jel...@openindex.io> wrote:
>
> Hello,
>
> A background batch process compiles a data set. When finished, it sends a
> delete-all to its target collection, then everything gets sent by SolrJ,
> followed by a regular commit. When inspecting the core I notice it has one
> segment with 9578 documents, of which exactly half are deleted.
>
> That Solr node is on 7.5. How can I encourage the merge scheduler to do its
> job and merge away all those deletes?
>
> Thanks,
> Markus
