[ 
https://issues.apache.org/jira/browse/LUCENE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319860#comment-17319860
 ] 

Zach Chen commented on LUCENE-9335:
-----------------------------------

{quote}bq.DisjunctionMaxScorer

It should be DisjunctionSumScorer if we want to get the same scoring 
(DisjunctionSumScorer sums up scores of the clauses while DisjunctionMaxScorer 
takes the max).

Can you run the benchmark with verifyCounts=False like we do for nightlies so 
that it would only check top hits? 
https://github.com/mikemccand/luceneutil/blob/fbc9ae15a1bb47f7e15c95fb70f1bda57faccfc1/src/python/nightlyBench.py#L822{quote}

Ah, I was wondering about the nightly benchmark earlier, and I now see why 
Michael suggested the top-N hit verification above. Let me give that a try. I 
didn't use DisjunctionSumScorer earlier because it seems to be missing some 
block-related method implementations, but I will use it to run the latest 
benchmark as well.
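
To make the scoring difference from the quote concrete, here is a minimal 
sketch in plain Python. It is not Lucene code: the helper names and the 
per-clause scores are made up for illustration, and the max combiner ignores 
any tie-breaker. It only shows that a sum-based and a max-based disjunction 
can match the same documents yet rank the top hits differently.

```python
# Toy illustration only: these helpers are hypothetical and are not Lucene APIs.

def sum_score(clause_scores):
    """Combine per-clause scores by summing them (sum-style disjunction)."""
    return sum(clause_scores)


def max_score(clause_scores):
    """Combine per-clause scores by taking the max (max-style, no tie-breaker)."""
    return max(clause_scores)


def top_hits(docs, combine, n):
    """Return the ids of the n highest-scoring documents under `combine`."""
    ranked = sorted(docs, key=lambda doc_id: combine(docs[doc_id]), reverse=True)
    return ranked[:n]


# Made-up per-clause scores for documents matching a two-term disjunction.
docs = {
    "doc_a": [2.0, 2.0],  # matches both terms moderately
    "doc_b": [3.0],       # matches one term strongly
    "doc_c": [1.0, 0.5],  # matches both terms weakly
}

by_sum = top_hits(docs, sum_score, 2)
by_max = top_hits(docs, max_score, 2)
```

Both combiners see the same three matching documents, so a hit-count check 
cannot tell them apart, but the order of the top hits differs, and the top 
hits are what a top-N verification compares.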

> Add a bulk scorer for disjunctions that does dynamic pruning
> ------------------------------------------------------------
>
>                 Key: LUCENE-9335
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9335
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>
> Lucene often gets benchmarked against other engines, e.g. against Tantivy and 
> PISA at [https://tantivy-search.github.io/bench/] or against research 
> prototypes in Table 1 of 
> [https://cs.uwaterloo.ca/~jimmylin/publications/Grand_etal_ECIR2020_preprint.pdf].
>  Given that top-level disjunctions of term queries are commonly used for 
> benchmarking, it would be nice to optimize this case a bit more. I suspect 
> that we could make fewer per-document decisions by implementing a BulkScorer 
> instead of a Scorer.



