[ https://issues.apache.org/jira/browse/LUCENE-10650 ]

Nathan Meisels deleted comment on LUCENE-10650:
-----------------------------------------

    was (Author: JIRAUSER292626):
    Hi [~jpountz]!

Appreciate your help until now!

Another question. I did a reindex and I get different scores.

query is:
{code:java}
{
  "query": {
    "term": {
      "sessionIds": "1234-1234"
    }
  }
}{code}
New index explain:
{code:java}
{
  "_index": "entities-new",
  "_type": "entity",
  "_id": "AWByRrSPIGshPfnDk4hN",
  "matched": true,
  "explanation": {
    "value": 22.941677,
    "description": "weight(sessionIds:1234-1234 in 1400) [PerFieldSimilarity], result of:",
    "details": [
      {
        "value": 22.941677,
        "description": "score from ScriptedSimilarity(weightScript=[Script{type=inline, lang='painless', idOrCode='return query.boost * Math.log((field.docCount+1.0)/(term.docFreq+0.5)) / Math.log(2);', options={}, params={}}], script=[Script{type=inline, lang='painless', idOrCode='return weight;', options={}, params={}}]) computed from:",
        "details": [
          { "value": 22.941677, "description": "weight", "details": [] },
          { "value": 1.0, "description": "query.boost", "details": [] },
          { "value": 12084378, "description": "field.docCount", "details": [] },
          { "value": 4.730932E+7, "description": "field.sumDocFreq", "details": [] },
          { "value": -1.0, "description": "field.sumTotalTermFreq", "details": [] },
          { "value": 1.0, "description": "term.docFreq", "details": [] },
          { "value": -1.0, "description": "term.totalTermFreq", "details": [] },
          { "value": 1.0, "description": "doc.freq", "details": [] },
          { "value": 1.0, "description": "doc.length", "details": [] }
        ]
      }
    ]
  }
}{code}
Old index explain:
{code:java}
{
  "_index" : "entities-old",
  "_type" : "entity",
  "_id" : "AWByRrSPIGshPfnDk4hN",
  "matched" : true,
  "explanation" : {
    "value" : 21.23644,
    "description" : "weight(sessionIds:1234-1234 in 527154) [PerFieldSimilarity], result of:",
    "details" : [
      {
        "value" : 21.23644,
        "description" : "score(DFRSimilarity, doc=527154, freq=1.0), computed from:",
        "details" : [
          { "value" : 1.0, "description" : "no normalization", "details" : [ ] },
          {
            "value" : 21.23644,
            "description" : "BasicModelIn, computed from: ",
            "details" : [
              { "value" : 1.605901E7, "description" : "numberOfDocuments", "details" : [ ] },
              { "value" : 6.0, "description" : "docFreq", "details" : [ ] }
            ]
          },
          { "value" : 1.0, "description" : "no aftereffect", "details" : [ ] }
        ]
      }
    ]
  }
}{code}
Does this make sense? I need the scores to stay the same.

Thanks

> "after_effect": "no" was removed what replaces it?
> --------------------------------------------------
>
>                 Key: LUCENE-10650
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10650
>             Project: Lucene - Core
>          Issue Type: Wish
>            Reporter: Nathan Meisels
>            Priority: Major
>
> Hi!
> We have been using an old version of elasticsearch with the following
> settings:
>
> {code:java}
>         "default": {
>           "queryNorm": "1",
>           "type": "DFR",
>           "basic_model": "in",
>           "after_effect": "no",
>           "normalization": "no"
>         }{code}
>
> I see [here|https://issues.apache.org/jira/browse/LUCENE-8015] that
> "after_effect": "no" was removed.
> In
> [old|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/5.5.0/lucene/core/src/java/org/apache/lucene/search/similarities/BasicModelIn.java#L33]
> version score was:
> {code:java}
> return tfn * (float)(log2((N + 1) / (n + 0.5)));{code}
> In
> [new|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.2/lucene/core/src/java/org/apache/lucene/search/similarities/BasicModelIn.java#L43]
> version it's:
> {code:java}
> long N = stats.getNumberOfDocuments();
> long n = stats.getDocFreq();
> double A = log2((N + 1) / (n + 0.5));
> // basic model I should return A * tfn
> // which we rewrite to A * (1 + tfn) - A
> // so that it can be combined with the after effect while still guaranteeing
> // that the result is non-decreasing with tfn
> return A * aeTimes1pTfn * (1 - 1 / (1 + tfn));
> {code}
> I tried changing after_effect to "l" but the scoring is different than what we are used to.
> (We depend heavily on the exact scoring).
> Do you have any advice how we can keep the same scoring as before?
> Thanks

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
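
As a rough sanity check of the two explain outputs quoted above, the sketch below evaluates log2((N + 1) / (n + 0.5)) once with the old index's statistics (numberOfDocuments = 1.605901E7, docFreq = 6) and once with the new index's (field.docCount = 12084378, term.docFreq = 1). The class and method names are hypothetical; only the expression and the input values are taken from the message. Note that the old numberOfDocuments value is shown rounded to seven significant digits in the explain output, so the result is approximate.

```java
// Hypothetical helper, not part of Lucene or Elasticsearch.
class BasicModelInCheck {

    /**
     * Evaluates log2((N + 1) / (n + 0.5)), the expression that appears in the
     * old BasicModelIn code (with tfn = 1) and in the painless weightScript.
     */
    static double idfLog2(double numDocs, double docFreq) {
        return Math.log((numDocs + 1.0) / (docFreq + 0.5)) / Math.log(2);
    }

    public static void main(String[] args) {
        // Statistics copied from the "Old index explain" output.
        double oldScore = idfLog2(1.605901E7, 6.0);
        // Statistics copied from the "New index explain" output.
        double newScore = idfLog2(12084378, 1.0);

        System.out.printf("old index stats: %.5f (explain reported 21.23644)%n", oldScore);
        System.out.printf("new index stats: %.5f (explain reported 22.941677)%n", newScore);
    }
}
```

Under these inputs the same expression lands within rounding of both reported scores, which suggests (without proving) that the gap comes from the two indices having different document counts and document frequencies for this term, rather than from the scripted similarity computing something different from the old DFR configuration.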