Michele Palmia created LUCENE-9269:
--------------------------------------

             Summary: Blended queries with boolean rewrite can result in inconsistent scores
                 Key: LUCENE-9269
                 URL: https://issues.apache.org/jira/browse/LUCENE-9269
             Project: Lucene - Core
          Issue Type: Bug
          Components: core/search
    Affects Versions: 8.4
            Reporter: Michele Palmia

If two blended queries are built so that
 * some of their terms are the same
 * their rewrite method is BlendedTermQuery.BOOLEAN_REWRITE

the docFreq used to score the overlapping terms is picked as follows:
 * if the overlapping terms are not boosted, the df of the term in the first blended query is used
 * if any of the overlapping terms is boosted, the df is picked at (what looks like) random.

A few examples using a field with 2 terms: f:a (df: 2), and f:b (df: 3).
{code:java}
1.
Blended(f:a f:b) Blended(f:a)
     df: 3          df: 2
gets rewritten to:
(f:a)^2.0 (f:b)
  df: 3    df: 2

2.
Blended(f:a) Blended(f:a f:b)
   df: 2          df: 3
gets rewritten to:
(f:a)^2.0 (f:b)
  df: 2    df: 2

3.
Blended(f:a f:b^0.66) Blended(f:a^0.75)
       df: 3               df: 2
gets rewritten to:
(f:a)^1.75 (f:b)^0.66
  df: ?      df: 2
{code}
with ? either 2 or 3, depending on the run.
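
The order dependence reported for unboosted overlapping terms can be illustrated with a small toy model. This is a sketch, not Lucene code: {{OverlapMergeSketch}}, {{Clause}} and {{merge}} are hypothetical names, and the merge rule (sum the boosts, keep the df of the first occurrence) is only an assumption that restates the behaviour observed in examples 1 and 2. It does not model the boosted case, where the df appears to vary between runs.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the reported behaviour; hypothetical names, NOT Lucene code. */
final class OverlapMergeSketch {

    /** One term clause as seen by scoring: its boost and the docFreq it carries. */
    static final class Clause {
        final String term;
        final float boost;
        final int df;

        Clause(String term, float boost, int df) {
            this.term = term;
            this.boost = boost;
            this.df = df;
        }
    }

    /**
     * Merges clauses that share a term. Assumption taken from the report:
     * boosts are summed and the df of the FIRST occurrence is kept.
     */
    static Map<String, Clause> merge(List<Clause> clauses) {
        Map<String, Clause> merged = new LinkedHashMap<>();
        for (Clause c : clauses) {
            merged.merge(c.term, c,
                (first, later) -> new Clause(first.term, first.boost + later.boost, first.df));
        }
        return merged;
    }

    public static void main(String[] args) {
        // f:a as it appears in Blended(f:a f:b) (df: 3) and in Blended(f:a) (df: 2).
        Clause fromWide = new Clause("f:a", 1.0f, 3);
        Clause fromNarrow = new Clause("f:a", 1.0f, 2);

        Clause ex1 = merge(Arrays.asList(fromWide, fromNarrow)).get("f:a");
        Clause ex2 = merge(Arrays.asList(fromNarrow, fromWide)).get("f:a");

        System.out.println("(f:a)^" + ex1.boost + " df: " + ex1.df);
        System.out.println("(f:a)^" + ex2.boost + " df: " + ex2.df);
    }
}
```

In this model the same pair of clauses ends up with df 3 or df 2 depending only on which one comes first, while the merged boost is 2.0 in both cases. That is why two logically equivalent queries could score differently.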