[PR] Made DocIdsWriter use DISI when reading documents with an IntersectVisitor [lucene]

via GitHub Fri, 01 Mar 2024 01:54:33 -0800


antonha opened a new pull request, #13149:
URL: https://github.com/apache/lucene/pull/13149


   Instead of calling `IntersectVisitor.visit` for each doc in the 
`readDelta16` and `readInts32` methods, create a `DocIdSetIterator` and call 
`IntersectVisitor.visit(DocIdSetIterator)` instead.
   
   This seems to make Lucene faster at some sorting and range querying tasks - 
I saw a **35-45% reduction in execution time**. I learnt this by running 
this benchmark setup by Elastic: 
https://github.com/elastic/elasticsearch-opensearch-benchmark. 
   
   The hypothesis is that this is due to fewer virtual calls being made - once 
per BKD leaf, instead of once per document. Note that this is only measurable 
if the readInts methods have been called with at least 3 implementations of the 
IntersectVisitor interface - otherwise JIT inlining removes the virtual 
call. In real-life Lucene deployments, I would judge it very likely 
that at least 3 implementations are used. The methodology and further details 
are described in this blog post: https://blunders.io/posts/es-benchmark-4-inlining
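
   To illustrate the difference in call shape, here is a minimal, self-contained 
sketch. It does not use Lucene: `DocIterator`, `Visitor`, `CollectingVisitor`, 
`readPerDoc` and `readBulk` are hypothetical stand-ins invented for this 
example, not the actual `DocIdsWriter` code. It only counts how many times the 
reader calls through the visitor interface for one leaf; it does not measure 
JIT behaviour, and the iterator's own `nextDoc` calls are outside what it counts.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkVisitSketch {

    /** Hypothetical stand-in for a doc ID iterator (not Lucene's DocIdSetIterator). */
    interface DocIterator {
        int NO_MORE_DOCS = Integer.MAX_VALUE;

        int nextDoc();
    }

    /** Hypothetical stand-in for a visitor with a per-doc and a bulk entry point. */
    interface Visitor {
        void visit(int docID);

        void visit(DocIterator iterator);
    }

    /** Collects doc IDs and counts how often the reader called into it. */
    static final class CollectingVisitor implements Visitor {
        final List<Integer> docs = new ArrayList<>();
        int interfaceCalls = 0;

        @Override
        public void visit(int docID) {
            interfaceCalls++;
            docs.add(docID);
        }

        @Override
        public void visit(DocIterator iterator) {
            interfaceCalls++;
            // The per-document loop now lives inside the visitor implementation.
            for (int doc = iterator.nextDoc();
                 doc != DocIterator.NO_MORE_DOCS;
                 doc = iterator.nextDoc()) {
                docs.add(doc);
            }
        }
    }

    /** Old shape: one call through the Visitor interface per document. */
    static void readPerDoc(int[] leafDocs, Visitor visitor) {
        for (int doc : leafDocs) {
            visitor.visit(doc);
        }
    }

    /** New shape: wrap the leaf's doc IDs in an iterator and call the visitor once. */
    static void readBulk(int[] leafDocs, Visitor visitor) {
        visitor.visit(new DocIterator() {
            private int index = 0;

            @Override
            public int nextDoc() {
                return index < leafDocs.length ? leafDocs[index++] : NO_MORE_DOCS;
            }
        });
    }

    public static void main(String[] args) {
        int[] leaf = {3, 7, 8, 15};

        CollectingVisitor perDoc = new CollectingVisitor();
        readPerDoc(leaf, perDoc);

        CollectingVisitor bulk = new CollectingVisitor();
        readBulk(leaf, bulk);

        // Same documents collected, different number of calls from the reader.
        System.out.println(perDoc.interfaceCalls + " vs " + bulk.interfaceCalls); // prints "4 vs 1"
    }
}
```

   Both paths collect the same doc IDs; only the number of calls made at the 
reader's call site differs (one per document versus one per leaf).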
   
   I tried benchmarking this with luceneutil, but did not see any significant 
change with the default benchmark - I suspect that I'm using the wrong 
luceneutil tasks to see any major difference. Which luceneutil benchmarks 
should I be using for these changes?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


