[
https://issues.apache.org/jira/browse/LUCENE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319035#comment-17319035
]
Zach Chen commented on LUCENE-9335:
-----------------------------------
Thanks Michael for the comment! I searched luceneutil for the error message
"wrong hitCount" from above and found these two places:
1. [https://github.com/mikemccand/luceneutil/blob/8fb67282db936c8df33c05b896d92169579c1876/src/python/benchUtil.py#L177-L183]
   (where "wrong hitCount" is generated)
2. [https://github.com/mikemccand/luceneutil/blob/8fb67282db936c8df33c05b896d92169579c1876/src/python/benchUtil.py#L1626-L1629]
   (where the task is aborted once the runtime error from the first location is raised)
So I'm guessing the failure actually happened before the top N hits and the
id / score checks were verified, so it may mask further potential failures by
aborting the task early (which would also explain why I got varying benchmark
results across several recent runs)?
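To illustrate the ordering I have in mind, here is a minimal Python sketch. It
is not the luceneutil code: `TaskResult`, `compare_results`, and the score
tolerance are hypothetical names and values made up for illustration. The only
thing taken from above is that the hit count check raises a runtime error
before the top N and id / score checks run.

```python
# Hypothetical sketch of the check ordering; NOT the actual luceneutil code.
from dataclasses import dataclass, field


@dataclass
class TaskResult:
    hit_count: int                             # total hits reported for the query
    hits: list = field(default_factory=list)   # top N as (doc_id, score) pairs


def compare_results(baseline, candidate, tolerance=1e-5):
    """Raise RuntimeError on the first mismatch between two results."""
    # Check 1: total hit count. Raising here skips every check below,
    # so a top N or score difference would never be reported.
    if baseline.hit_count != candidate.hit_count:
        raise RuntimeError(
            f"wrong hitCount: {baseline.hit_count} vs {candidate.hit_count}"
        )

    # Check 2: number of top N hits returned.
    if len(baseline.hits) != len(candidate.hits):
        raise RuntimeError(
            f"wrong top N size: {len(baseline.hits)} vs {len(candidate.hits)}"
        )

    # Check 3: doc ids and scores, rank by rank.
    for rank, ((id_a, score_a), (id_b, score_b)) in enumerate(
        zip(baseline.hits, candidate.hits)
    ):
        if id_a != id_b:
            raise RuntimeError(f"wrong doc id at rank {rank}: {id_a} vs {id_b}")
        if abs(score_a - score_b) > tolerance:
            raise RuntimeError(
                f"wrong score at rank {rank}: {score_a} vs {score_b}"
            )
```

With this ordering, a candidate that differs in both hit count and top N ids
only ever reports the hit count problem, which is the masking effect I was
guessing at.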
> Add a bulk scorer for disjunctions that does dynamic pruning
> ------------------------------------------------------------
>
> Key: LUCENE-9335
> URL: https://issues.apache.org/jira/browse/LUCENE-9335
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
>
> Lucene often gets benchmarked against other engines, e.g. against Tantivy and
> PISA at [https://tantivy-search.github.io/bench/] or against research
> prototypes in Table 1 of
> [https://cs.uwaterloo.ca/~jimmylin/publications/Grand_etal_ECIR2020_preprint.pdf].
> Given that top-level disjunctions of term queries are commonly used for
> benchmarking, it would be nice to optimize this case a bit more. I suspect
> that we could make fewer per-document decisions by implementing a BulkScorer
> instead of a Scorer.