[ https://issues.apache.org/jira/browse/LUCENE-10650 ]


    Nathan Meisels deleted comment on LUCENE-10650:
    -----------------------------------------

was (Author: JIRAUSER292626):
Hi [~jpountz]!

Thank you for your help so far!

I have another question: after reindexing, I get different scores.

The query is:

 
{code:java}
{
  "query": {
    "term": {
      "sessionIds": "1234-1234"
    }
  }
}{code}
 

New index explain:
{code:java}
{
  "_index": "entities-new",
  "_type": "entity",
  "_id": "AWByRrSPIGshPfnDk4hN",
  "matched": true,
  "explanation": {
    "value": 22.941677,
    "description": "weight(sessionIds:1234-1234 in 1400) [PerFieldSimilarity], result of:",
    "details": [
      {
        "value": 22.941677,
        "description": "score from ScriptedSimilarity(weightScript=[Script{type=inline, lang='painless', idOrCode='return query.boost * Math.log((field.docCount+1.0)/(term.docFreq+0.5)) / Math.log(2);', options={}, params={}}], script=[Script{type=inline, lang='painless', idOrCode='return weight;', options={}, params={}}]) computed from:",
        "details": [
          {
            "value": 22.941677,
            "description": "weight",
            "details": []
          },
          {
            "value": 1.0,
            "description": "query.boost",
            "details": []
          },
          {
            "value": 12084378,
            "description": "field.docCount",
            "details": []
          },
          {
            "value": 4.730932E+7,
            "description": "field.sumDocFreq",
            "details": []
          },
          {
            "value": -1.0,
            "description": "field.sumTotalTermFreq",
            "details": []
          },
          {
            "value": 1.0,
            "description": "term.docFreq",
            "details": []
          },
          {
            "value": -1.0,
            "description": "term.totalTermFreq",
            "details": []
          },
          {
            "value": 1.0,
            "description": "doc.freq",
            "details": []
          },
          {
            "value": 1.0,
            "description": "doc.length",
            "details": []
          }
        ]
      }
    ]
  }
}{code}
 

Old index explain:
{code:java}
{
  "_index" : "entities-old",
  "_type" : "entity",
  "_id" : "AWByRrSPIGshPfnDk4hN",
  "matched" : true,
  "explanation" : {
    "value" : 21.23644,
    "description" : "weight(sessionIds:1234-1234 in 527154) [PerFieldSimilarity], result of:",
    "details" : [
      {
        "value" : 21.23644,
        "description" : "score(DFRSimilarity, doc=527154, freq=1.0), computed from:",
        "details" : [
          {
            "value" : 1.0,
            "description" : "no normalization",
            "details" : [ ]
          },
          {
            "value" : 21.23644,
            "description" : "BasicModelIn, computed from: ",
            "details" : [
              {
                "value" : 1.605901E7,
                "description" : "numberOfDocuments",
                "details" : [ ]
              },
              {
                "value" : 6.0,
                "description" : "docFreq",
                "details" : [ ]
              }
            ]
          },
          {
            "value" : 1.0,
            "description" : "no aftereffect",
            "details" : [ ]
          }
        ]
      }
    ]
  }
}{code}

Does this make sense? I need the scores to stay the same.
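
One way to check where the gap comes from is to recompute both scores by hand. The sketch below assumes the only thing in play is the formula log2((N + 1) / (n + 0.5)) shown in the explain outputs above, fed with the statistics each index reports; the class and method names are made up for illustration.

```java
public class BasicModelInCheck {
    // log2((N + 1) / (n + 0.5)), the expression that appears in the
    // scripted similarity and in the old BasicModelIn (with tfn = 1).
    static double idf(double docCount, double docFreq) {
        return Math.log((docCount + 1.0) / (docFreq + 0.5)) / Math.log(2);
    }

    public static void main(String[] args) {
        // Statistics copied from the two explain outputs.
        double newScore = idf(12084378, 1);    // field.docCount, term.docFreq
        double oldScore = idf(1.605901E7, 6);  // numberOfDocuments, docFreq
        System.out.printf("new index: %.4f%n", newScore);
        System.out.printf("old index: %.4f%n", oldScore);
    }
}
```

Both values come out at roughly the reported scores (about 22.94 and 21.24), which suggests the formula itself is unchanged and the difference is down to the document count and docFreq that each index reports for this term.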

Thanks

 

> "after_effect": "no" was removed what replaces it?
> --------------------------------------------------
>
>                 Key: LUCENE-10650
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10650
>             Project: Lucene - Core
>          Issue Type: Wish
>            Reporter: Nathan Meisels
>            Priority: Major
>
> Hi!
> We have been using an old version of Elasticsearch with the following
> settings:
>  
> {code:java}
>         "default": {
>           "queryNorm": "1",
>           "type": "DFR",
>           "basic_model": "in",
>           "after_effect": "no",
>           "normalization": "no"
>         }{code}
>  
> I see [here|https://issues.apache.org/jira/browse/LUCENE-8015] that 
> "after_effect": "no" was removed.
> In the [old|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/5.5.0/lucene/core/src/java/org/apache/lucene/search/similarities/BasicModelIn.java#L33] version the score was:
> {code:java}
> return tfn * (float)(log2((N + 1) / (n + 0.5)));{code}
> In the [new|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.2/lucene/core/src/java/org/apache/lucene/search/similarities/BasicModelIn.java#L43] version it's:
> {code:java}
> long N = stats.getNumberOfDocuments();
> long n = stats.getDocFreq();
> double A = log2((N + 1) / (n + 0.5));
> // basic model I should return A * tfn
> // which we rewrite to A * (1 + tfn) - A
> // so that it can be combined with the after effect while still guaranteeing
> // that the result is non-decreasing with tfn
> return A * aeTimes1pTfn * (1 - 1 / (1 + tfn));
> {code}
> I tried changing after_effect to "l", but the scoring is different from what
> we are used to. (We depend heavily on the exact scoring.)
> Do you have any advice on how we can keep the same scoring as before?
> Thanks
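
As a rough illustration of why switching after_effect to "l" changes the numbers, the two formulas quoted above can be put side by side. The sketch assumes that after effect L contributes aeTimes1pTfn = 1.0 (the after effect 1 / (1 + tfn) multiplied by (1 + tfn)); that value is an assumption here, not something stated in the issue, and the class and method names are hypothetical. The statistics are the ones from the old index explain output.

```java
public class DfrBasicModelInCompare {
    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    // Lucene 5.5 formula as quoted: tfn * log2((N + 1) / (n + 0.5))
    static double oldScore(long N, long n, double tfn) {
        return tfn * log2((N + 1) / (n + 0.5));
    }

    // Lucene 8.11 formula as quoted: A * aeTimes1pTfn * (1 - 1 / (1 + tfn))
    static double newScore(long N, long n, double tfn, double aeTimes1pTfn) {
        double A = log2((N + 1) / (n + 0.5));
        return A * aeTimes1pTfn * (1 - 1 / (1 + tfn));
    }

    public static void main(String[] args) {
        long N = 16059010L; // numberOfDocuments from the old explain output
        long n = 6L;        // docFreq from the old explain output
        double tfn = 1.0;   // no normalization, freq = 1
        double ae = 1.0;    // ASSUMPTION: after effect L times (1 + tfn)
        System.out.println("old: " + oldScore(N, n, tfn));
        System.out.println("new: " + newScore(N, n, tfn, ae));
    }
}
```

Under that assumption the new expression reduces to A * tfn / (1 + tfn), so for tfn = 1 it yields half of the old score rather than the old score itself.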



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
