[ https://issues.apache.org/jira/browse/LUCENE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501791#comment-17501791 ]
Adrien Grand commented on LUCENE-10078:
---------------------------------------

These two policies already have a concept of a "minimum segment size": it is called {{minSize}} on {{LogMergePolicy}} and {{floorSegmentBytes}} on {{TieredMergePolicy}}. I wonder if it would make sense to reuse this parameter for simplicity and merge on flush any segment that is smaller than this size, or if we should configure the maximum size for merge-on-flush via a different parameter.

> Enable merge-on-refresh by default?
> -----------------------------------
>
>                 Key: LUCENE-10078
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10078
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Priority: Major
>
> This is a spinoff from the discussion in LUCENE-10073.
> The newish merge-on-refresh ([crazy origin story|https://blog.mikemccandless.com/2021/03/open-source-collaboration-or-how-we.html])
> feature is a powerful way to reduce searched segment counts, especially
> helpful for applications using many indexing threads. Such usage will write
> many tiny segments on each refresh, which could quickly be merged up during
> the {{refresh}} operation.
> We would have to implement a default for {{findFullFlushMerges}}
> (LUCENE-10064 is open for this), and then we would need
> {{IndexWriterConfig.getMaxFullFlushMergeWaitMillis}} to default to a non-zero
> value (this issue).
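
For illustration only, the first option in the comment above (reusing the existing floor size as the cutoff for merge-on-flush) could be sketched roughly as follows. This is not Lucene code and not the eventual default for {{findFullFlushMerges}}: the {{Segment}} class, the {{selectForMergeOnFlush}} method and the 2 MB floor are all hypothetical names and values made up for this sketch, and the real policy would operate on Lucene's own segment metadata.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, self-contained model of the idea; it does not use any Lucene classes.
final class MergeOnFlushSketch {

    /** Stand-in for a freshly flushed segment; only its size matters here. */
    static final class Segment {
        final String name;
        final long sizeBytes;

        Segment(String name, long sizeBytes) {
            this.name = name;
            this.sizeBytes = sizeBytes;
        }
    }

    /**
     * Returns the flushed segments that are smaller than the floor size,
     * i.e. the candidates for a single merge during the flush.
     * Returns an empty list when fewer than two segments qualify,
     * because merging one segment with nothing gains nothing.
     */
    static List<Segment> selectForMergeOnFlush(List<Segment> flushed, long floorBytes) {
        List<Segment> small = new ArrayList<>();
        for (Segment s : flushed) {
            if (s.sizeBytes < floorBytes) {
                small.add(s);
            }
        }
        if (small.size() < 2) {
            return Collections.emptyList();
        }
        return small;
    }

    public static void main(String[] args) {
        long floorBytes = 2L * 1024 * 1024; // assumed floor, chosen only for the example

        List<Segment> flushed = new ArrayList<>();
        flushed.add(new Segment("_a", 300L * 1024));        // tiny, from indexing thread 1
        flushed.add(new Segment("_b", 500L * 1024));        // tiny, from indexing thread 2
        flushed.add(new Segment("_c", 1536L * 1024));       // still below the floor
        flushed.add(new Segment("_d", 50L * 1024 * 1024));  // large, left alone

        for (Segment s : selectForMergeOnFlush(flushed, floorBytes)) {
            System.out.println("merge on flush: " + s.name + " (" + s.sizeBytes + " bytes)");
        }
    }
}
```

The second option in the comment would amount to passing a separately configured maximum size instead of the floor as the second argument, which keeps the selection logic the same but adds one more setting to document and tune.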