[ https://issues.apache.org/jira/browse/LUCENE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17316059#comment-17316059 ]

Adrien Grand commented on LUCENE-9335:
--------------------------------------

[~zacharymorn] Yes that would be one idea. In the BMM paper (http://engineering.nyu.edu/~suel/papers/bmm.pdf) BMM is usually a bit slower than BMW but not always. I'd be curious to know whether we observe the same result in Lucene.

Since we introduced BMW there have been a few reports that top-level disjunctions got slower. This is usually because there are many clauses in a disjunction that have about the same max score and BMW can hardly skip evaluating documents. In such cases we pay for the BMW overhead without enjoying any benefits. Because BMM has less overhead, I would expect it to perform better in these worst-case scenarios, so I wonder if we should look into using BMM for top-level disjunctions in general.

> Add a bulk scorer for disjunctions that does dynamic pruning
> ------------------------------------------------------------
>
>                 Key: LUCENE-9335
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9335
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>
> Lucene often gets benchmarked against other engines, e.g. against Tantivy and
> PISA at [https://tantivy-search.github.io/bench/] or against research
> prototypes in Table 1 of
> [https://cs.uwaterloo.ca/~jimmylin/publications/Grand_etal_ECIR2020_preprint.pdf].
> Given that top-level disjunctions of term queries are commonly used for
> benchmarking, it would be nice to optimize this case a bit more, I suspect
> that we could make fewer per-document decisions by implementing a BulkScorer
> instead of a Scorer.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
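
For readers unfamiliar with the worst case described in the comment above, here is a minimal sketch of the MaxScore-style partitioning that BMM (Block-Max MaxScore) builds on. It is not Lucene code and does not use any Lucene API: the class name, the method name and the sample max scores are all hypothetical, and the strict "sum of max scores < minimum competitive score" rule is an assumption about how non-essential clauses are chosen. The point is only to show that when clause max scores are skewed, most clauses can be demoted to non-essential, whereas when they are all about the same, most clauses stay essential and little can be skipped.

```java
import java.util.Arrays;

// Illustrative only: a toy model of MaxScore clause partitioning, not Lucene's implementation.
public class MaxScorePartitionSketch {

    /**
     * Returns how many clauses are "non-essential": the longest prefix of clauses,
     * taken in increasing order of max score, whose summed max scores stay strictly
     * below the minimum competitive score. A document matching only those clauses
     * cannot be competitive (assumed rule, see the lead-in).
     */
    static int countNonEssential(double[] maxScores, double minCompetitiveScore) {
        double[] sorted = maxScores.clone();
        Arrays.sort(sorted);
        double prefixSum = 0;
        int nonEssential = 0;
        for (double maxScore : sorted) {
            if (prefixSum + maxScore >= minCompetitiveScore) {
                break; // adding this clause could produce a competitive hit
            }
            prefixSum += maxScore;
            nonEssential++;
        }
        return nonEssential;
    }

    public static void main(String[] args) {
        double minCompetitiveScore = 3.5; // hypothetical threshold from the top-k heap

        // Skewed max scores: one clause dominates, the rest can be set aside.
        double[] skewed = {0.2, 0.3, 0.5, 5.0};

        // Ten clauses with about the same max score: the worst case from the comment.
        double[] flat = new double[10];
        Arrays.fill(flat, 1.0);

        System.out.println("skewed: " + countNonEssential(skewed, minCompetitiveScore)
            + " of " + skewed.length + " clauses are non-essential");
        System.out.println("flat:   " + countNonEssential(flat, minCompetitiveScore)
            + " of " + flat.length + " clauses are non-essential");
    }
}
```

With these made-up numbers, the skewed query leaves a single essential clause to drive iteration, while the flat query leaves seven of its ten clauses essential, so a pruning scorer does its bookkeeping while skipping very little.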