[ https://issues.apache.org/jira/browse/LUCENE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326323#comment-17326323 ]
Zach Chen commented on LUCENE-9335:
-----------------------------------

Hi [~jpountz], I took a stab at implementing BMM and published a new PR for further discussion: [https://github.com/apache/lucene/pull/101]. I'm pretty happy to have been able to implement a new scorer, even though its performance is a bit poor (although it seems to be on par with the experimental results published in [http://engineering.nyu.edu/~suel/papers/bmm.pdf] comparing BMM and BMW on 2-clause OR queries). Shall we consider adding a benchmark query set with 5+ clauses to compare performance, since the paper suggests that is where BMM may outperform BMW?

> Add a bulk scorer for disjunctions that does dynamic pruning
> ------------------------------------------------------------
>
>                 Key: LUCENE-9335
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9335
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Lucene often gets benchmarked against other engines, e.g. against Tantivy and
> PISA at [https://tantivy-search.github.io/bench/] or against research
> prototypes in Table 1 of
> [https://cs.uwaterloo.ca/~jimmylin/publications/Grand_etal_ECIR2020_preprint.pdf].
> Given that top-level disjunctions of term queries are commonly used for
> benchmarking, it would be nice to optimize this case a bit more. I suspect
> that we could make fewer per-document decisions by implementing a BulkScorer
> instead of a Scorer.