jpountz opened a new pull request, #12198:
URL: https://github.com/apache/lucene/pull/12198

   lucene-util's `IndexGeoNames` benchmark suffers from heavy contention when 
running with many indexing threads, 20 in my case. The main offender is 
`DocumentsWriterFlushControl#doAfterDocument`, which runs after every index 
operation to update document and RAM accounting.
   
   This change reduces contention by only updating RAM accounting once the RAM 
consumed by a single DWPT (`DocumentsWriterPerThread`) that has not been 
committed yet reaches at least 0.1% of the total RAM buffer size. This 
effectively batches updates to RAM accounting, similarly to what happens when 
using `IndexWriter#addDocuments` to index multiple documents at once. Since 
updates to RAM accounting may be batched, `FlushPolicy` can no longer 
distinguish between inserts, updates and deletes, so the three corresponding 
methods were merged into a single one.
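
   To illustrate the batching idea, here is a minimal sketch. It is not the 
code from this pull request: `BatchedRamAccounting`, `SharedAccounting`, 
`LocalWriter` and all of their methods are hypothetical names made up for the 
example. It only assumes a shared, synchronized counter (standing in for the 
contended flush control) and a per-thread writer that holds back its RAM delta 
until it reaches 0.1% of the RAM buffer size.

```java
// Hypothetical sketch of threshold-based batching; none of these names exist in Lucene.
final class BatchedRamAccounting {

    /** Stand-in for the shared, contended accounting state. */
    static final class SharedAccounting {
        private long committedBytes;
        private int commitCalls;

        // Every call takes the shared lock, so fewer calls means less contention.
        synchronized void commit(long deltaBytes) {
            committedBytes += deltaBytes;
            commitCalls++;
        }

        synchronized long committedBytes() {
            return committedBytes;
        }

        synchronized int commitCalls() {
            return commitCalls;
        }
    }

    /** Stand-in for a per-thread writer; only ever used by a single thread. */
    static final class LocalWriter {
        private final SharedAccounting shared;
        private final long thresholdBytes;
        private long uncommittedBytes;

        LocalWriter(SharedAccounting shared, long ramBufferBytes) {
            this.shared = shared;
            // 0.1% of the total RAM buffer size, and at least 1 byte.
            this.thresholdBytes = Math.max(1L, ramBufferBytes / 1000L);
        }

        /** Called after every index operation with the RAM it consumed. */
        void afterDocument(long bytesUsed) {
            uncommittedBytes += bytesUsed;
            if (uncommittedBytes >= thresholdBytes) {
                commitPending();
            }
        }

        /** Pushes any pending delta to the shared state, e.g. before a flush. */
        void commitPending() {
            if (uncommittedBytes > 0) {
                shared.commit(uncommittedBytes);
                uncommittedBytes = 0;
            }
        }
    }

    public static void main(String[] args) {
        SharedAccounting shared = new SharedAccounting();
        LocalWriter writer = new LocalWriter(shared, 1_000_000L);
        for (int i = 0; i < 100; i++) {
            writer.afterDocument(100L);
        }
        writer.commitPending();
        System.out.println("committed bytes: " + shared.committedBytes());
        System.out.println("synchronized calls: " + shared.commitCalls());
    }
}
```

   In this sketch the shared state no longer sees individual operations, only 
accumulated deltas, which is the same reason the batched updates described 
above can no longer tell inserts, updates and deletes apart.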
   
   With this change, `IndexGeoNames` goes from ~22s to ~19s and the main 
offender for contention is now `DocumentsWriterPerThreadPool#getAndLock`.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

