[ https://issues.apache.org/jira/browse/LUCENE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17345793#comment-17345793 ]
Zach Chen commented on LUCENE-9335:
-----------------------------------

I changed the BulkScorer implementations to return false for BMM eligibility as soon as a non-term query is identified, which improved the benchmark results for Fuzzy1 and Fuzzy2 a bit ([https://github.com/apache/lucene/pull/113/commits/f4115f78be0833b65694ad6a0f9f4f32565091e7]). However, the Fuzzy1 and Fuzzy2 results appear to vary more across runs and across the queries used than the results of the other tasks.

> Add a bulk scorer for disjunctions that does dynamic pruning
> ------------------------------------------------------------
>
>                 Key: LUCENE-9335
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9335
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: wikimedium.10M.nostopwords.tasks,
>                      wikimedium.10M.nostopwords.tasks.5OrMeds
>
>          Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Lucene often gets benchmarked against other engines, e.g. against Tantivy and
> PISA at [https://tantivy-search.github.io/bench/] or against research
> prototypes in Table 1 of
> [https://cs.uwaterloo.ca/~jimmylin/publications/Grand_etal_ECIR2020_preprint.pdf].
> Given that top-level disjunctions of term queries are commonly used for
> benchmarking, it would be nice to optimize this case a bit more, I suspect
> that we could make fewer per-document decisions by implementing a BulkScorer
> instead of a Scorer.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org