[jira] [Resolved] (LUCENE-9725) Allow BM25FQuery to use other similarities

Julie Tibshirani (Jira) Thu, 04 Feb 2021 14:28:07 -0800


     [ https://issues.apache.org/jira/browse/LUCENE-9725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Julie Tibshirani resolved LUCENE-9725.
--------------------------------------
    Fix Version/s: 8.9
       Resolution: Fixed

> Allow BM25FQuery to use other similarities
> ------------------------------------------
>
>                 Key: LUCENE-9725
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9725
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Julie Tibshirani
>            Priority: Major
>             Fix For: 8.9
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> From a high level, BM25FQuery works as follows:
> # Given a list of fields and weights, it pretends there's a synthetic 
> combined field where all terms have been indexed. It computes new term and 
> collection statistics for this combined field.
> # It uses a disjunction iterator and BM25Similarity to score the documents.
> The steps are (1) compute statistics that represent the combined field 
> content, and (2) pass these to a similarity function. There is nothing really 
> specific to BM25Similarity in this approach. In step 2, we could use another 
> similarity, for example BooleanSimilarity or those based on language models 
> like LMDirichletSimilarity. The main restriction is that norms have to be 
> additive (the norm of the combined field must be the sum of the field norms).
> Maybe we could stop hardcoding BM25Similarity in BM25FQuery and instead use the 
> one configured on IndexSearcher. We could think of this as providing a 
> sensible default approach to cross-field scoring for many similarities. It's 
> an incremental step towards LUCENE-8711, which would give similarities more 
> fine-grained control over how stats/scores are combined across fields.
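
The two steps described above can be illustrated with a minimal sketch in plain
Java, with no Lucene dependency. Everything in it is an assumption made for
illustration: the names CombinedFieldSketch, FieldStats, CombinedStats, Scorer,
combine, bm25Like and booleanLike are hypothetical and are not Lucene APIs. The
BM25-like scorer leaves out IDF, and collection-level statistics are not
modeled. The point is only that step 1 (weighted, additive statistics) is
independent of which scoring function is used in step 2.

```java
import java.util.List;

final class CombinedFieldSketch {

    /** Per-document statistics of one field (hypothetical type, not a Lucene class). */
    static final class FieldStats {
        final double weight;    // field weight (boost)
        final long termFreq;    // occurrences of the term in this field
        final long fieldLength; // number of tokens in this field

        FieldStats(double weight, long termFreq, long fieldLength) {
            this.weight = weight;
            this.termFreq = termFreq;
            this.fieldLength = fieldLength;
        }
    }

    /** Statistics of the synthetic combined field for one document. */
    static final class CombinedStats {
        final double termFreq;
        final double fieldLength;

        CombinedStats(double termFreq, double fieldLength) {
            this.termFreq = termFreq;
            this.fieldLength = fieldLength;
        }
    }

    /** Step 2 plug-in point: any scoring function over the combined statistics. */
    interface Scorer {
        double score(CombinedStats stats);
    }

    /** Step 1: weighted sums, so the combined length is additive across fields. */
    static CombinedStats combine(List<FieldStats> fields) {
        double tf = 0;
        double len = 0;
        for (FieldStats f : fields) {
            tf += f.weight * f.termFreq;
            len += f.weight * f.fieldLength;
        }
        return new CombinedStats(tf, len);
    }

    /** BM25-style saturation with length normalization; IDF is deliberately omitted. */
    static Scorer bm25Like(double k1, double b, double avgLength) {
        return s -> s.termFreq
                / (s.termFreq + k1 * (1 - b + b * s.fieldLength / avgLength));
    }

    /** Boolean-style scoring: a constant score for any match, zero otherwise. */
    static Scorer booleanLike() {
        return s -> s.termFreq > 0 ? 1.0 : 0.0;
    }
}
```

For example, combining a title field (weight 2, one occurrence, 5 tokens) with
a body field (weight 1, three occurrences, 100 tokens) gives a combined term
frequency of 5 and a combined length of 110. Either scorer can then be applied
to the same CombinedStats, which mirrors the proposal to take the similarity
from IndexSearcher instead of fixing it to BM25Similarity.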



