[jira] [Commented] (LUCENE-10061) CombinedFieldsQuery needs dynamic pruning support

Adrien Grand (Jira) Mon, 08 Nov 2021 05:15:04 -0800


    [ 
https://issues.apache.org/jira/browse/LUCENE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440448#comment-17440448
 ]


Adrien Grand commented on LUCENE-10061:
---------------------------------------

Thanks for exploring this area [~zacharymorn]! I wonder if LUCENE-9335 could 
help reduce the overhead of pruning, since Maxscore tends to have lower 
overhead than WAND.

I see that you tested with 4 and 2 as boost values. I wonder whether it would 
make a difference to try e.g. 20 and 1 instead. I just looked again at Table 
3.1 of https://www.staff.city.ac.uk/~sbrp622/papers/foundations_bm25_review.pdf 
and the optimal weights that they found for title/body were 38.4/1 on one 
dataset and 13.5/1 on another dataset.
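
To make the point about the boost ratio concrete, here is a minimal sketch of 
a BM25F-style combination, where per-field term frequencies are weighted and 
summed before saturation. It is not Lucene code: `combined_score`, the two toy 
documents and the simplification that every document has average length (so 
length normalization drops out) are all assumptions made for illustration.

```python
def combined_score(tf, weights, idf=1.0, k1=1.2):
    """Toy BM25F-style score for a single term.

    tf      -- dict mapping field name to the term frequency in that field
    weights -- dict mapping field name to its boost
    Assumes every document has average length, so the length
    normalization factor is 1 and only k1 remains in the denominator.
    """
    # Weighted sum of per-field term frequencies (the "combined" frequency).
    combined_tf = sum(weights[f] * tf.get(f, 0) for f in weights)
    # Saturating term-frequency component.
    return idf * combined_tf / (combined_tf + k1)


# Hypothetical documents: A matches once in the title, B three times in the body.
doc_a = {"title": 1}
doc_b = {"body": 3}

for weights in ({"title": 4, "body": 2}, {"title": 20, "body": 1}):
    a = combined_score(doc_a, weights)
    b = combined_score(doc_b, weights)
    print(weights, "->", "A" if a > b else "B", "ranks first")
```

With these made-up numbers, B ranks first under 4/2 and A ranks first under 
20/1, so the ratio can change which documents are competitive, and with it 
how much pruning is possible.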




> CombinedFieldsQuery needs dynamic pruning support
> -------------------------------------------------
>
>                 Key: LUCENE-10061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10061
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: CombinedFieldQueryTasks.wikimedium.10M.nostopwords.tasks
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> CombinedFieldQuery's Scorer doesn't implement advanceShallow/getMaxScore, 
> forcing Lucene to collect all matches in order to figure out the top-k hits.
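
As a rough illustration of why score upper bounds matter for top-k 
collection, the sketch below skips whole blocks of documents whose maximum 
score cannot beat the current k-th best hit. It is a toy model, not Lucene's 
implementation: `top_k_with_pruning`, the block layout and the scores are 
hypothetical, and the per-block maximum merely stands in for what a 
getMaxScore-style upper bound would provide.

```python
import heapq


def top_k_with_pruning(blocks, k):
    """Return (top-k hits, number of documents actually scored).

    blocks -- list of (max_score, [(doc_id, score), ...]) pairs, where
              max_score is an upper bound on every score in the block.
    """
    heap = []    # min-heap of (score, doc_id); heap[0] is the k-th best so far
    scored = 0
    for max_score, docs in blocks:
        # Once k hits are collected, a block whose upper bound does not
        # exceed the weakest collected hit cannot contribute: skip it.
        if len(heap) == k and max_score <= heap[0][0]:
            continue
        for doc_id, score in docs:
            scored += 1
            if len(heap) < k:
                heapq.heappush(heap, (score, doc_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, doc_id))
    return sorted(heap, reverse=True), scored


# Made-up data: three blocks of two documents each.
blocks = [
    (0.9, [(1, 0.9), (2, 0.5)]),
    (0.4, [(3, 0.4), (4, 0.1)]),   # skipped: 0.4 cannot beat 0.5
    (0.95, [(5, 0.95), (6, 0.2)]),
]
hits, scored = top_k_with_pruning(blocks, k=2)
print(hits, scored)
```

Without the upper bounds, every block would have to be visited, which 
corresponds to collecting all matches.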



