jpountz opened a new pull request, #13931: URL: https://github.com/apache/lucene/pull/13931
I was looking at some queries where Lucene performs significantly worse than Tantivy at https://tantivy-search.github.io/bench/, and found out that we get quite some overhead from implementing `BooleanScorer` on top of `BulkScorer` (effectively implemented by `DefaultBulkScorer` since it only runs term queries as boolean clauses) rather than `Scorer` directly. The `CountOrHighHigh` and `CountOrHighMed` tasks are a bit noisy on my machine, so I did 3 runs on wikibigall, and all of them had speedups for these two tasks, often with a very low p-value. In theory, this change could make things slower when the inner query has a specialized bulk scorer, such as `MatchAllDocsQuery` or a conjunction. It does feel right to optimize for term queries though. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org