[ https://issues.apache.org/jira/browse/LUCENE-9406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279289#comment-17279289 ]
Zach Chen commented on LUCENE-9406:
-----------------------------------

Thanks Michael and Andrzej for the feedback here! +1 for simplification!

When I proposed the interfaces above, I was also considering a way to consume the logged metrics. Are we considering writing them out to some default file location with the *IndexWriterMetrics* interface above, or would something like the following work as well?

{code:java}
interface IndexWriterMetrics {
  void beginMergeOnFullFlush(OneMerge merge);
  void endMergeOnFullFlush(OneMerge merge);
  ...
  // Exposes the collected metrics so the application can consume them directly.
  Map<MetricName, Metric> providesMetrics();
}
{code}

> Make it simpler to track IndexWriter's events
> ---------------------------------------------
>
>                 Key: LUCENE-9406
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9406
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Michael McCandless
>            Priority: Major
>
> This is the second spinoff from a [controversial PR to add a new index-time
> feature to Lucene to merge small segments during
> commit|https://github.com/apache/lucene-solr/pull/1552]. That change can
> substantially reduce the number of small index segments to search.
> In that PR, there was a new proposed interface, {{IndexWriterEvents}}, giving
> the application a chance to track when {{IndexWriter}} kicked off merges
> during commit, how many, how long it waited, how often it gave up waiting,
> etc.
> Such telemetry from production usage is really helpful when tuning settings
> like which merges (e.g. a size threshold) to attempt on commit, and how long
> to wait during commit, etc.
> I am splitting out this issue to explore possible approaches to do this.
> E.g. [~simonw] proposed using a statistics class instead, but if I understood
> that correctly, I think that would put the role of aggregation inside
> {{IndexWriter}}, which is not ideal.
> Many interesting events, e.g. how many merges are being requested, how large
> they are, how long they took to complete or fail, etc., can be gleaned by
> wrapping expert Lucene classes like {{MergePolicy}} and {{MergeScheduler}}.
> But for those events that cannot (e.g. {{IndexWriter}} stopped waiting for
> merges during commit), it would be very helpful to have some simple way to
> track them so applications can tune better.
> It is also possible to subclass {{IndexWriter}} and override key methods, but
> I think that is inherently risky as {{IndexWriter}}'s protected methods are
> not considered to be a stable API, and the synchronization used by
> {{IndexWriter}} is confusing.
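
To make the question about consuming the metrics more concrete, here is a minimal, self-contained sketch of the "expose a map" option, as opposed to writing to a default file location. Everything in it is an assumption made for illustration: {{MergeInfo}} is a hypothetical stand-in for Lucene's {{OneMerge}} so the example compiles without Lucene, plain {{String}}/{{Long}} pairs replace the undefined {{MetricName}}/{{Metric}} types, and {{InMemoryIndexWriterMetrics}} and the metric names are not part of any Lucene API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for Lucene's OneMerge, so the sketch compiles on its own.
final class MergeInfo {
  final String name;

  MergeInfo(String name) {
    this.name = name;
  }
}

// Callback interface shaped like the proposal; an assumption, not a Lucene API.
interface IndexWriterMetrics {
  void beginMergeOnFullFlush(MergeInfo merge);

  void endMergeOnFullFlush(MergeInfo merge);

  Map<String, Long> providesMetrics();
}

// Keeps counters in memory; the application decides where to publish them.
final class InMemoryIndexWriterMetrics implements IndexWriterMetrics {
  private final LongAdder started = new LongAdder();
  private final LongAdder finished = new LongAdder();
  private final LongAdder totalNanos = new LongAdder();
  // Start time per in-flight merge, keyed by object identity.
  private final Map<MergeInfo, Long> startTimes = new ConcurrentHashMap<>();

  @Override
  public void beginMergeOnFullFlush(MergeInfo merge) {
    startTimes.put(merge, System.nanoTime());
    started.increment();
  }

  @Override
  public void endMergeOnFullFlush(MergeInfo merge) {
    Long start = startTimes.remove(merge);
    if (start == null) {
      return; // end without a matching begin: ignore rather than skew the numbers
    }
    totalNanos.add(System.nanoTime() - start);
    finished.increment();
  }

  @Override
  public Map<String, Long> providesMetrics() {
    // Point-in-time snapshot; callers cannot modify the internal counters.
    Map<String, Long> snapshot = new HashMap<>();
    snapshot.put("mergeOnFullFlush.started", started.sum());
    snapshot.put("mergeOnFullFlush.finished", finished.sum());
    snapshot.put("mergeOnFullFlush.inFlight", (long) startTimes.size());
    snapshot.put("mergeOnFullFlush.totalNanos", totalNanos.sum());
    return Collections.unmodifiableMap(snapshot);
  }
}
```

With something along these lines, an application could poll {{providesMetrics()}} on its own schedule and forward the snapshot to whatever metrics system it already uses, which keeps publishing and any further aggregation outside of {{IndexWriter}}.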