[jira] [Commented] (LUCENE-9335) Add a bulk scorer for disjunctions that does dynamic pruning

Adrien Grand (Jira) Wed, 07 Apr 2021 00:17:07 -0700


    [ https://issues.apache.org/jira/browse/LUCENE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17316059#comment-17316059 ]


Adrien Grand commented on LUCENE-9335:
--------------------------------------

[~zacharymorn] Yes, that would be one idea. In the BMM paper 
(http://engineering.nyu.edu/~suel/papers/bmm.pdf), BMM is usually a bit slower 
than BMW, but not always. I'd be curious to know whether we observe the same 
result in Lucene.

Since we introduced BMW, there have been a few reports that top-level 
disjunctions got slower. This usually happens when many clauses of a 
disjunction have about the same max score, so BMW can hardly skip 
evaluating documents. In such cases we pay for the BMW overhead without 
enjoying any of its benefits. Because BMM has less overhead, I would expect 
it to perform better in these worst-case scenarios, so I wonder if we should 
look into using BMM for top-level disjunctions in general.
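
To make the worst case above more concrete, here is a rough sketch of the 
clause partitioning that MaxScore-style algorithms (the family BMM belongs to) 
rely on. It is not Lucene code: the class name, method name and the sample 
scores are all made up for illustration. Clauses are sorted by max score, and 
the longest prefix whose summed max scores stays below the minimum competitive 
score is "non-essential", meaning a document matching only those clauses 
cannot be competitive. When all clauses have about the same max score, that 
prefix tends to be empty and nothing can be skipped.

```java
import java.util.Arrays;

// Hypothetical illustration only; this class does not exist in Lucene.
final class MaxScorePartition {

    /**
     * Counts the "non-essential" clauses: the longest prefix of the
     * ascending-sorted max scores whose sum stays strictly below
     * minCompetitiveScore. A document that matches only these clauses
     * scores less than minCompetitiveScore, so it cannot be competitive.
     */
    static int countNonEssential(float[] maxScores, float minCompetitiveScore) {
        float[] sorted = maxScores.clone();
        Arrays.sort(sorted);
        double prefixSum = 0;
        int count = 0;
        for (float maxScore : sorted) {
            if (prefixSum + maxScore >= minCompetitiveScore) {
                break; // adding this clause could reach the threshold
            }
            prefixSum += maxScore;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up per-clause max scores and threshold, for illustration only.
        float[] skewed = {0.5f, 0.7f, 1.0f, 4.0f};
        float[] similar = {2.0f, 2.0f, 2.0f, 2.0f};
        float minCompetitiveScore = 1.5f;

        // Skewed max scores: the two lowest clauses can be ignored when
        // looking for candidate documents.
        System.out.println(countNonEssential(skewed, minCompetitiveScore));  // 2
        // Similar max scores: every clause alone can reach the threshold,
        // so no clause can be ignored.
        System.out.println(countNonEssential(similar, minCompetitiveScore)); // 0
    }
}
```

In the second case any pruning bookkeeping is pure overhead, which is why an 
approach with cheaper bookkeeping might do better there. Whether that holds in 
Lucene would still need to be measured.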

> Add a bulk scorer for disjunctions that does dynamic pruning
> ------------------------------------------------------------
>
>                 Key: LUCENE-9335
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9335
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>
> Lucene often gets benchmarked against other engines, e.g. against Tantivy and 
> PISA at [https://tantivy-search.github.io/bench/] or against research 
> prototypes in Table 1 of 
> [https://cs.uwaterloo.ca/~jimmylin/publications/Grand_etal_ECIR2020_preprint.pdf].
> Given that top-level disjunctions of term queries are commonly used for 
> benchmarking, it would be nice to optimize this case a bit more. I suspect 
> that we could make fewer per-document decisions by implementing a BulkScorer 
> instead of a Scorer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

