[ https://issues.apache.org/jira/browse/LUCENE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440873#comment-17440873 ]
Zach Chen commented on LUCENE-10061:
------------------------------------
{quote}Thanks for exploring this area [~zacharymorn]!
{quote}
No problem, I'm always interested in exploring and learning about Lucene
querying!
{quote}I wonder if LUCENE-9335 could be helpful to reduce the overhead of
pruning, since Maxscore tends to have lower overhead than WAND.
{quote}
As far as I can tell from reading and testing CombinedFieldQuery,
WANDScorer is not used there. In addition, the PR already does a
Maxscore-like calculation based on competitive impacts to skip docs. Am I
missing anything here?
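
To make the "Maxscore-like calculation" concrete, here is a minimal sketch of
the idea, not the code from the PR. It assumes each block of postings exposes
a small set of competitive (freq, docLen) pairs, derives a score upper bound
from them with a plain BM25 saturation, and skips the block when that bound
is below the current minimum competitive score. The class, the method names
and the constants are hypothetical and are not Lucene APIs.

```java
public class ImpactSkipSketch {
    // Assumed BM25 parameters; chosen only for illustration.
    static final float K1 = 1.2f;
    static final float B = 0.75f;

    /** BM25 term-frequency saturation with length normalization, without IDF. */
    static float tfNorm(float freq, float docLen, float avgDocLen) {
        return freq / (freq + K1 * (1 - B + B * docLen / avgDocLen));
    }

    /**
     * Upper bound on the score of any doc in a block, computed from the
     * block's competitive impacts, each given as {freq, docLen}.
     */
    static float blockMaxScore(float[][] impacts, float idf, float avgDocLen) {
        float max = 0f;
        for (float[] impact : impacts) {
            max = Math.max(max, idf * tfNorm(impact[0], impact[1], avgDocLen));
        }
        return max;
    }

    /** A block can be skipped when even its best possible score is not competitive. */
    static boolean canSkip(float[][] impacts, float idf, float avgDocLen,
                           float minCompetitiveScore) {
        return blockMaxScore(impacts, idf, avgDocLen) < minCompetitiveScore;
    }

    public static void main(String[] args) {
        float[][] weakBlock = {{1f, 100f}};
        float[][] strongBlock = {{1f, 100f}, {10f, 100f}};
        System.out.println(canSkip(weakBlock, 1f, 100f, 0.5f));
        System.out.println(canSkip(strongBlock, 1f, 100f, 0.5f));
    }
}
```

The skip decision only depends on the upper bound, so the tighter the
per-block impacts are, the more blocks can be skipped once the top-k queue
fills up and the minimum competitive score rises.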
{quote}I see that you tested with 4 and 2 as boost values. I wonder if it makes
a difference if you try out e.g. 20 and 1 instead. I just looked again at table
3.1 on
[https://www.staff.city.ac.uk/~sbrp622/papers/foundations_bm25_review.pdf] and
the optimal weights that they found for title/body were 38.4/1 on one dataset
and 13.5/1 on another dataset.
{quote}
Sounds good, will give that a try!
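
For reference, a small sketch of why the boost ratio could matter, assuming
the simple BM25F formulation from the paper linked above, where per-field
term frequencies are combined as a weighted sum before saturation. Length
normalization and IDF are left out to keep it short, and the class and
method names are hypothetical, not Lucene APIs.

```java
public class Bm25fWeightSketch {
    // Assumed BM25 saturation parameter; chosen only for illustration.
    static final float K1 = 1.2f;

    /** Weighted sum of per-field term frequencies: sum over fields of weight * freq. */
    static float combinedFreq(float[] fieldFreqs, float[] fieldWeights) {
        float sum = 0f;
        for (int i = 0; i < fieldFreqs.length; i++) {
            sum += fieldWeights[i] * fieldFreqs[i];
        }
        return sum;
    }

    /** BM25 saturation applied once to the combined frequency. */
    static float saturate(float freq) {
        return freq / (freq + K1);
    }

    public static void main(String[] args) {
        // Field order is {title, body}.
        float[] titleHit = {1f, 0f}; // term appears once in the title only
        float[] bodyHit = {0f, 3f};  // term appears three times in the body only

        float[][] weightSets = {{4f, 2f}, {20f, 1f}};
        for (float[] weights : weightSets) {
            float titleScore = saturate(combinedFreq(titleHit, weights));
            float bodyScore = saturate(combinedFreq(bodyHit, weights));
            System.out.println(weights[0] + "/" + weights[1]
                + " title-only=" + titleScore + " body-only=" + bodyScore);
        }
    }
}
```

With 4/2 the body-only doc ends up with the larger combined frequency (6 vs
4), while with 20/1 the title-only doc does (20 vs 3), so the two settings
can rank the same pair of docs differently.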
> CombinedFieldsQuery needs dynamic pruning support
> -------------------------------------------------
>
> Key: LUCENE-10061
> URL: https://issues.apache.org/jira/browse/LUCENE-10061
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: CombinedFieldQueryTasks.wikimedium.10M.nostopwords.tasks
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> CombinedFieldQuery's Scorer doesn't implement advanceShallow/getMaxScore,
> forcing Lucene to collect all matches in order to figure out the top-k hits.