jpountz opened a new pull request, #12198: URL: https://github.com/apache/lucene/pull/12198
lucene-util's `IndexGeoNames` benchmark is heavily contended when running with many indexing threads (20 in my case). The main offender is `DocumentsWriterFlushControl#doAfterDocument`, which runs after every index operation to update doc and RAM accounting.

This change reduces contention by updating RAM accounting only when the RAM consumption that a single `DocumentsWriterPerThread` (DWPT) has not yet committed reaches at least 0.1% of the total RAM buffer size. This effectively batches updates to RAM accounting, similar to what happens when `IndexWriter#addDocuments` is used to index multiple documents at once.

Since updates to RAM accounting may be batched, `FlushPolicy` can no longer distinguish between inserts, updates and deletes, so the three corresponding methods were merged into a single one.

With this change, `IndexGeoNames` goes from ~22s to ~19s, and the main offender for contention is now `DocumentsWriterPerThreadPool#getAndLock`.
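For readers who want to see the batching idea in isolation, here is a minimal sketch. It is not the code from this pull request: `BatchedRamAccounting`, `Writer`, `afterDocument` and the other names are hypothetical, and the real logic lives in `DocumentsWriterFlushControl`. It assumes each indexing thread tracks its uncommitted bytes locally and only takes the shared lock once that amount reaches 0.1% of the RAM buffer.

```java
/**
 * Hypothetical sketch of batched RAM accounting. All names here are
 * illustrative and are not part of Lucene's API.
 */
final class BatchedRamAccounting {
  // Pending bytes must reach this value before they are committed.
  private final long commitThresholdBytes;
  private final Object lock = new Object();
  private long committedBytes; // guarded by lock

  BatchedRamAccounting(long ramBufferBytes) {
    // 0.1% of the total RAM buffer size, and at least one byte.
    this.commitThresholdBytes = Math.max(1L, ramBufferBytes / 1000L);
  }

  /** Per-thread state; stands in for a single DWPT and is used by one thread. */
  final class Writer {
    private long pendingBytes; // bytes not yet added to the shared total

    /** Called after each indexed document; returns true if a commit happened. */
    boolean afterDocument(long bytesUsed) {
      pendingBytes += bytesUsed;
      if (pendingBytes < commitThresholdBytes) {
        return false; // cheap path: no shared state is touched
      }
      synchronized (lock) { // contended path, now taken far less often
        committedBytes += pendingBytes;
      }
      pendingBytes = 0;
      return true;
    }

    long pendingBytes() {
      return pendingBytes;
    }
  }

  Writer newWriter() {
    return new Writer();
  }

  long committedBytes() {
    synchronized (lock) {
      return committedBytes;
    }
  }
}
```

With a 1,000,000-byte buffer the threshold in this sketch is 1,000 bytes, so a writer that adds 400 bytes per document only takes the lock on every third document. The trade-off is that the shared total can lag behind actual usage by up to the threshold per writer.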