[PR] Move BooleanScorer to work on top of Scorers rather than BulkScorers. [lucene]

via GitHub Fri, 18 Oct 2024 08:50:20 -0700


jpountz opened a new pull request, #13931:
URL: https://github.com/apache/lucene/pull/13931


   I was looking at some queries where Lucene performs significantly worse than 
Tantivy at https://tantivy-search.github.io/bench/, and found that we incur 
noticeable overhead from implementing `BooleanScorer` on top of `BulkScorer` 
(in practice `DefaultBulkScorer`, since the benchmark only runs term queries 
as boolean clauses) rather than on top of `Scorer` directly.
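
   To illustrate the structural difference, here is a minimal, self-contained 
sketch. It does not use Lucene's classes: `PostingsIterator`, `scoreRange`, 
`countViaCallback` and `countDirect` are hypothetical stand-ins, and the window 
size of 2048 is an assumption chosen for the example. Both paths compute the 
same windowed disjunction count. The first pushes every doc ID through a 
collector-style callback, the second reads the iterators directly. It shows 
where the extra indirection sits and is not a benchmark.

```java
import java.util.function.IntConsumer;

// Hypothetical sketch; none of these types are Lucene's actual classes.
class WindowedDisjunctionSketch {
    static final int WINDOW_SIZE = 2048; // docs per window (assumed for this sketch)

    /** Minimal iterator over a sorted doc-ID array, positioned on its first doc. */
    static final class PostingsIterator {
        static final int NO_MORE_DOCS = Integer.MAX_VALUE;
        private final int[] docs;
        private int idx = 0;
        PostingsIterator(int[] sortedDocs) { this.docs = sortedDocs; }
        int docID() { return idx < docs.length ? docs[idx] : NO_MORE_DOCS; }
        int nextDoc() { idx++; return docID(); }
    }

    /** Bulk-style path: the clause pushes each doc ID below max into a callback. */
    static void scoreRange(PostingsIterator it, int max, IntConsumer collector) {
        for (int doc = it.docID(); doc < max; doc = it.nextDoc()) {
            collector.accept(doc);
        }
    }

    static PostingsIterator[] open(int[][] clauses) {
        PostingsIterator[] its = new PostingsIterator[clauses.length];
        for (int i = 0; i < clauses.length; i++) its[i] = new PostingsIterator(clauses[i]);
        return its;
    }

    /** Counts and clears the bits set for the current window. */
    static int drain(long[] bits) {
        int count = 0;
        for (int w = 0; w < bits.length; w++) {
            count += Long.bitCount(bits[w]);
            bits[w] = 0L;
        }
        return count;
    }

    /** Disjunction count where each clause reports matches through a callback. */
    static int countViaCallback(int[][] clauses, int maxDoc) {
        PostingsIterator[] its = open(clauses);
        long[] bits = new long[WINDOW_SIZE / 64];
        int count = 0;
        for (int base = 0; base < maxDoc; base += WINDOW_SIZE) {
            final int windowBase = base;
            int end = Math.min(base + WINDOW_SIZE, maxDoc);
            IntConsumer collector = doc -> {
                int i = doc - windowBase;
                bits[i >>> 6] |= 1L << i; // long shifts use the low 6 bits of i
            };
            for (PostingsIterator it : its) scoreRange(it, end, collector);
            count += drain(bits);
        }
        return count;
    }

    /** Same disjunction count, reading the iterators directly inside the loop. */
    static int countDirect(int[][] clauses, int maxDoc) {
        PostingsIterator[] its = open(clauses);
        long[] bits = new long[WINDOW_SIZE / 64];
        int count = 0;
        for (int base = 0; base < maxDoc; base += WINDOW_SIZE) {
            int end = Math.min(base + WINDOW_SIZE, maxDoc);
            for (PostingsIterator it : its) {
                for (int doc = it.docID(); doc < end; doc = it.nextDoc()) {
                    int i = doc - base;
                    bits[i >>> 6] |= 1L << i;
                }
            }
            count += drain(bits);
        }
        return count;
    }

    public static void main(String[] args) {
        int[][] clauses = { {1, 5, 3000}, {5, 7, 3000, 4100} };
        System.out.println(countViaCallback(clauses, 5000)); // prints 5
        System.out.println(countDirect(clauses, 5000));      // prints 5
    }
}
```

   In this sketch the two methods differ only in whether the per-document work 
goes through a callback or sits inline in the window loop.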
   
   The `CountOrHighHigh` and `CountOrHighMed` tasks are a bit noisy on my 
machine, so I did three runs on wikibigall. All three showed speedups for these 
two tasks, often with a very low p-value.
   
   In theory, this change could make things slower when the inner query has a 
specialized bulk scorer, such as `MatchAllDocsQuery` or a conjunction. It does 
feel right to optimize for term queries though.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

