[ 
https://issues.apache.org/jira/browse/LUCENE-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17538787#comment-17538787
 ] 

Michael McCandless commented on LUCENE-10574:
---------------------------------------------

I like [~jpountz]'s approach!

It keeps the "below floor" merges from being pathological by insisting that the 
sizes of the segments being merged are somewhat balanced (though less balanced 
than is required once the segments are over the floor size). The cost is 
O(N * log(N)) again, with a higher constant factor, rather than O(N^2). 
Progress not perfection (hi [~dweiss]).
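
To make the complexity difference concrete, here is a toy cost model. It is not 
TieredMergePolicy's actual selection logic: the class and method names are 
hypothetical, every flushed segment is assumed to have size 1, and the cost of a 
merge is assumed to be the size of the segment it writes. It compares merging 
each tiny segment into one ever-growing segment against merging only 
equal-sized segments.

```java
public class MergeCostSketch {

    // Unbalanced case: every newly flushed unit-size segment is merged
    // straight into one ever-growing segment, so the big segment is
    // rewritten on each merge. Total cost is 2 + 3 + ... + n, i.e. O(n^2).
    public static long unbalancedCost(int n) {
        long cost = 0;
        long big = 1; // the first flushed segment
        for (int i = 1; i < n; i++) {
            big += 1;    // absorb one more unit-size segment
            cost += big; // a merge rewrites its whole output
        }
        return cost;
    }

    // Balanced case: only segments of equal size are merged, fanIn at a
    // time. Each level rewrites about n units and there are about
    // log_fanIn(n) levels, so the total is O(n log n).
    // Assumption: n is a power of fanIn; leftover segments at a level
    // simply stay unmerged and are dropped from this model.
    public static long balancedCost(int n, int fanIn) {
        long cost = 0;
        long count = n; // segments at the current level
        long size = 1;  // size of each segment at the current level
        while (count >= fanIn) {
            long merges = count / fanIn;
            size *= fanIn;
            cost += merges * size;
            count = merges;
        }
        return cost;
    }

    public static void main(String[] args) {
        System.out.println("unbalanced: " + unbalancedCost(1000));
        System.out.println("balanced:   " + balancedCost(1000, 10));
    }
}
```

With 1000 unit-size segments and a fan-in of 10, this model gives a total cost 
of 500499 for the unbalanced strategy and 3000 for the balanced one.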

I do think (long-term) we should consider removing the floor entirely (open a 
follow-on issue after [~jpountz]'s PR), perhaps only once we enable 
merge-on-refresh by default. Applications that flush/refresh/commit tiny 
segments would pay a higher search-time price for the long tail of minuscule 
segments, but that is already an inefficient thing to do and so those users 
perhaps are not optimizing / caring about performance. If you follow the best 
practice for faster indexing (and you use merge-on-refresh/commit) you should 
be unaffected by complete removal of the floor merge size.

> Remove O(n^2) from TieredMergePolicy or change defaults to one that doesn't 
> do this
> -----------------------------------------------------------------------------------
>
>                 Key: LUCENE-10574
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10574
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Remove {{floorSegmentBytes}} parameter, or change lucene's default to a merge 
> policy that doesn't merge in an O(n^2) way.
> I have the feeling it might have to be the latter, as folks seem really wed 
> to this crazy O(n^2) behavior.


