tkarampAlpha opened a new issue, #13796: URL: https://github.com/apache/lucene/issues/13796
### Description

It seems that for SpanOrQuery, the IDF of terms belonging to subqueries that do not match a given document still affects that document's score.

I observed this with the following 3 documents:

```
doc1: field: something
doc2: field: nothing
doc3: field: anything
```

And I issue the following query:

```
spanOr([Contents:something, Contents:nothing])
```

If you look at the score explanation, you will notice that for both matching documents the idf of both terms contributes to the score, even though only one term matches each document.

This is an example of the explanation of the first document's score:

```
3.9616547 = weight(spanOr([Contents:something, Contents:nothing]) in 0) [AsBM25Similarity], result of:
  3.9616547 = score(freq=1.0), computed as boost * idf * tf from:
    51.0 = boost
    3.9616585 = idf, sum of:
      1.9808292 = idf for term nothing , computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) + 1 from:
        1 = docFreq
        3 = docCount
      1.9808292 = idf for term something , computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) + 1 from:
        1 = docFreq
        3 = docCount
    0.019607842 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
      1.0 = phraseFreq=1.0
      50.0 = k1, term saturation parameter
      0.0 = b, length normalization parameter
      1.0 = dl, length of field
      2.0 = avgdl, average length of field
```

### Version and environment details

Lucene 9.7.0, used through Solr 9.3.0
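The arithmetic in the explanation above can be reproduced outside Lucene. The following is a minimal Python sketch, not Lucene code: it only re-evaluates the formulas exactly as they are printed in the explanation, using the values shown there. The function and variable names are hypothetical, and the "matching term only" score is an assumption about what the report expects, since the issue does not state an expected value.

```python
import math


def idf(doc_freq: int, doc_count: int) -> float:
    # Formula as printed in the explanation:
    # log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) + 1
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5)) + 1


def tf(freq: float, k1: float, b: float, dl: float, avgdl: float) -> float:
    # Formula as printed in the explanation:
    # freq / (freq + k1 * (1 - b + b * dl / avgdl))
    return freq / (freq + k1 * (1 - b + b * dl / avgdl))


# Values copied from the explanation of the first document's score.
boost = 51.0
doc_count = 3
doc_freq = {"something": 1, "nothing": 1}
tf_value = tf(freq=1.0, k1=50.0, b=0.0, dl=1.0, avgdl=2.0)

# What the explanation reports: idf summed over BOTH terms of the spanOr.
idf_all_terms = sum(idf(df, doc_count) for df in doc_freq.values())
score_reported = boost * idf_all_terms * tf_value

# Hypothetical alternative (an assumption, not stated in the issue):
# use only the idf of the term that doc1 actually contains.
idf_matching_term = idf(doc_freq["something"], doc_count)
score_matching_only = boost * idf_matching_term * tf_value

print(f"idf (all terms):      {idf_all_terms:.7f}")
print(f"score (as reported):  {score_reported:.7f}")
print(f"idf (matching term):  {idf_matching_term:.7f}")
print(f"score (matching only): {score_matching_only:.7f}")
```

With these parameters, `boost * tf` equals 51 * (1 / 51) = 1, so the score is numerically equal to the idf value. That makes the effect easy to see: the reported score is the sum of both terms' idf, which is twice what a single matching term would contribute. Small differences in the last digits compared with the explanation are expected, because Python uses double precision here.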