[ 
https://issues.apache.org/jira/browse/LUCENE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nathan Meisels updated LUCENE-10650:
------------------------------------
    Description: 
Hi!

We have been using an old version of Elasticsearch with the following settings:

 
{code:java}
        "default": {
          "queryNorm": "1",
          "type": "DFR",
          "basic_model": "in",
          "after_effect": "no",
          "normalization": "no"
        }{code}
 

I see here that "after_effect": "no" was removed.

In the [old|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/5.5.0/lucene/core/src/java/org/apache/lucene/search/similarities/BasicModelIn.java#L33] version, the score was:
{code:java}
return tfn * (float)(log2((N + 1) / (n + 0.5)));{code}
In the [new|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.2/lucene/core/src/java/org/apache/lucene/search/similarities/BasicModelIn.java#L43] version, it is:
{code:java}
long N = stats.getNumberOfDocuments();
long n = stats.getDocFreq();
double A = log2((N + 1) / (n + 0.5));
// basic model I(n) should return A * tfn
// which we rewrite to A * (1 + tfn) - A
// so that it can be combined with the after effect while still guaranteeing
// that the result is non-decreasing with tfn
return A * aeTimes1pTfn * (1 - 1 / (1 + tfn));
{code}

I tried changing "after_effect" to "l", but the resulting scores differ from what we are used to. 
(We depend heavily on the exact scoring.)
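
To make the difference concrete, here is a minimal sketch that evaluates both formulas side by side. It is not Lucene code: the class and method names are hypothetical, it uses double throughout (the 5.5.0 code casts to float), and it assumes that after effect "l" is the Laplace factor 1 / (1 + tfn), so that aeTimes1pTfn is 1, and that the removed "no" after effect was the constant 1, so that aeTimes1pTfn would be 1 + tfn. Under those assumptions the rewrite A * tfn = A * (1 + tfn) * (1 - 1 / (1 + tfn)) reproduces the old score for "no", while "l" divides it by (1 + tfn).

```java
public class DfrScoreComparison {

    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // A = log2((N + 1) / (n + 0.5)), as in both versions quoted above.
    static double a(long N, long n) {
        return log2((N + 1) / (n + 0.5));
    }

    // Old formula (5.5.0 BasicModelIn with no after effect): tfn * A.
    static double oldScore(long N, long n, double tfn) {
        return tfn * a(N, n);
    }

    // New formula (8.11.2 BasicModelIn): A * aeTimes1pTfn * (1 - 1 / (1 + tfn)).
    static double newScore(long N, long n, double tfn, double aeTimes1pTfn) {
        return a(N, n) * aeTimes1pTfn * (1 - 1 / (1 + tfn));
    }

    public static void main(String[] args) {
        long N = 1000;    // number of documents (made-up value)
        long n = 10;      // document frequency (made-up value)
        double tfn = 3.0; // with "normalization": "no", tfn is assumed to equal tf

        // Assumption: after effect "l" is 1 / (1 + tfn), so aeTimes1pTfn == 1.
        double withL = newScore(N, n, tfn, 1.0);
        // Assumption: a constant after effect of 1 gives aeTimes1pTfn == 1 + tfn.
        double withConstantOne = newScore(N, n, tfn, 1.0 + tfn);

        System.out.println("old score:            " + oldScore(N, n, tfn));
        System.out.println("new, after effect l:  " + withL);
        System.out.println("new, constant effect: " + withConstantOne);
    }
}
```

If these assumptions hold, the "l" score is the old score divided by (1 + tfn), which is why it does not match what we were getting before.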

Do you have any advice on how we can keep the same scoring as before?

Thanks


> "after_effect": "no" was removed what replaces it?
> --------------------------------------------------
>
>                 Key: LUCENE-10650
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10650
>             Project: Lucene - Core
>          Issue Type: Wish
>            Reporter: Nathan Meisels
>            Priority: Major
>


