[jira] [Commented] (LUCENE-10574) Remove O(n^2) from TieredMergePolicy or change defaults to one that doesn't do this

Adrien Grand (Jira) Tue, 17 May 2022 06:19:05 -0700


    [ https://issues.apache.org/jira/browse/LUCENE-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17538183#comment-17538183 ]


Adrien Grand commented on LUCENE-10574:
---------------------------------------

I was assuming we wanted to have strong guarantees about the number of segments 
in the index at search time, but it's a fair point that degrading to O(n^2) 
merging to meet this guarantee is not a good trade-off.

I tried to think of ways we could do this. One obvious option is to remove 
{{floorSegmentBytes}}, but this might be too extreme, as it would allow 
any index to accumulate a long tail of small segments. One idea I started 
playing with is to require that every merge grow the largest input segment by 
at least some fraction, e.g. 50%, i.e. the merged segment must be at least that 
much larger than its largest input. This tries to strike a balance between 
avoiding pathological merging and keeping the number of segments contained 
at search time. I quickly hacked this into TieredMergePolicy, and it made the 
StoredFieldsBenchmark more than 2x faster. I wonder if there are other 
approaches we should consider.
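
To make the "grow the largest input segment by at least some fraction" idea concrete, here is a minimal sketch of such a check. It is not the actual change hacked into TieredMergePolicy: the class name {{MergeGrowthCheck}}, the method {{growsLargestEnough}}, and the use of plain byte sizes are all assumptions made for illustration, and a real implementation would presumably work on the policy's own segment-size accounting.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: decides whether a candidate merge
// is "worth it" under the rule sketched in the comment above.
public final class MergeGrowthCheck {
    private MergeGrowthCheck() {}

    /**
     * Returns true if merging the given segments produces a segment that is at
     * least (1 + minGrowthFraction) times the size of the largest input segment.
     * Sizes are in bytes; the merged size is approximated as the sum of inputs.
     */
    public static boolean growsLargestEnough(long[] segmentSizesBytes, double minGrowthFraction) {
        if (segmentSizesBytes.length < 2) {
            return false; // a merge needs at least two inputs
        }
        long largest = Arrays.stream(segmentSizesBytes).max().getAsLong();
        long total = Arrays.stream(segmentSizesBytes).sum();
        return total >= largest * (1.0 + minGrowthFraction);
    }

    public static void main(String[] args) {
        // One big segment plus a few tiny ones: rewriting the big segment to
        // absorb very little data is the pattern the rule is meant to reject.
        System.out.println(growsLargestEnough(new long[] {100, 1, 1, 1}, 0.5));
        // Similar-sized inputs: the merged segment is much larger than any input.
        System.out.println(growsLargestEnough(new long[] {10, 10, 10}, 0.5));
    }
}
```

With a 50% threshold, a candidate merge of one 100-unit segment and three 1-unit segments would be rejected, while a merge of three 10-unit segments would be accepted. Approximating the merged size as the sum of the inputs ignores deletes, which is another simplification of this sketch.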

> Remove O(n^2) from TieredMergePolicy or change defaults to one that doesn't 
> do this
> -----------------------------------------------------------------------------------
>
>                 Key: LUCENE-10574
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10574
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Priority: Major
>
> Remove the {{floorSegmentBytes}} parameter, or change Lucene's default to a 
> merge policy that doesn't merge in an O(n^2) way.
> I have the feeling it might have to be the latter, as folks seem really wed 
> to this crazy O(n^2) behavior.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
