[
https://issues.apache.org/jira/browse/LUCENE-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064294#comment-17064294
]
Michele Palmia commented on LUCENE-9269:
----------------------------------------
I have a few questions, please feel free to let me know if they're too dumb:
- While testing a solution that adds {{perReaderTermState}} to the current
{{TermQuery#equals}} implementation, I found a test that I believe no longer
checks anything it was designed to check. It was rewritten as part of an only
tangentially related change and has been a no-op ever since (the test is
[TestMultiTermQueryRewrites#checkBoosts|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/test/org/apache/lucene/search/TestMultiTermQueryRewrites.java#L215],
the problematic edit was
[this|https://github.com/apache/lucene-solr/commit/30807709e663c35f6760084632407dc1bf76aff7#diff-581d1e68f090e657acc327fc90534c51],
which left out the essential {{initialSeekTerm}}). Should I fix it as part of my
proposal for this issue or open a new issue?
- What's your opinion on comparing two TermQueries when only one of them has a
{{perReaderTermState}}? I'd say they're different, but their Weights could
ultimately end up using the exact same statistics.
- Changing {{equals}} without changing {{toString}} means errors like
{code:java}
expected:<foo:bar> but was:<foo:bar>
{code}
are possible. That seems to me less of an issue than adding df/ttf to the
TermQuery representation. Is that so?
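
To make the last point concrete, here is a minimal sketch of how such a failure message could come about. It does not use Lucene: {{SketchTermQuery}} is a hypothetical stand-in whose {{docFreq}} field plays the role of the per-reader term state, and {{failureMessage}} only imitates the usual assertion format.

```java
import java.util.Objects;

public class EqualsToStringSketch {

    // Hypothetical stand-in for a term query; this is NOT Lucene's TermQuery.
    static final class SketchTermQuery {
        final String field;
        final String text;
        final Integer docFreq; // null models "no per-reader term state"

        SketchTermQuery(String field, String text, Integer docFreq) {
            this.field = field;
            this.text = text;
            this.docFreq = docFreq;
        }

        // equals takes the extra state into account...
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof SketchTermQuery)) return false;
            SketchTermQuery other = (SketchTermQuery) o;
            return field.equals(other.field)
                && text.equals(other.text)
                && Objects.equals(docFreq, other.docFreq);
        }

        @Override
        public int hashCode() {
            return Objects.hash(field, text, docFreq);
        }

        // ...but toString does not, so unequal queries can print identically.
        @Override
        public String toString() {
            return field + ":" + text;
        }
    }

    // Imitates the shape of a typical assertEquals failure message.
    static String failureMessage(Object expected, Object actual) {
        return "expected:<" + expected + "> but was:<" + actual + ">";
    }

    public static void main(String[] args) {
        SketchTermQuery a = new SketchTermQuery("foo", "bar", 2);
        SketchTermQuery b = new SketchTermQuery("foo", "bar", 3);
        System.out.println(a.equals(b));          // false
        System.out.println(failureMessage(a, b)); // expected:<foo:bar> but was:<foo:bar>
    }
}
```

The same sketch also covers the second question: an instance built with a {{null}} {{docFreq}} compares unequal to one that carries statistics, which is the behaviour I would lean towards.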
> Blended queries with boolean rewrite can result in inconsistent scores
> ----------------------------------------------------------------------
>
> Key: LUCENE-9269
> URL: https://issues.apache.org/jira/browse/LUCENE-9269
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/search
> Affects Versions: 8.4
> Reporter: Michele Palmia
> Priority: Minor
> Attachments: LUCENE-9269-test.patch
>
>
> If two blended queries are should clauses of a boolean query and are built so
> that
> * some of their terms are the same
> * their rewrite method is BlendedTermQuery.BOOLEAN_REWRITE
> the docFreq used to score the overlapping terms is picked as follows:
> # if the overlapping terms are not boosted, the df of the term in the first
> blended query is used
> # if any of the overlapping terms is boosted, the df is picked at (what
> looks like) random.
> A few examples using a field with 2 terms: f:a (df: 2), and f:b (df: 3).
> {code:java}
> a)
> Blended(f:a f:b) Blended (f:a)
> df: 3 df: 2
> gets rewritten to:
> (f:a)^2.0 (f:b)
> df: 3 df:2
> b)
> Blended(f:a) Blended(f:a f:b)
> df: 2 df: 3
> gets rewritten to:
> (f:a)^2.0 (f:b)
> df: 2 df:2
> c)
> Blended(f:a f:b^0.66) Blended (f:a^0.75)
> df: 3 df: 2
> gets rewritten to:
> (f:a)^1.75 (f:b)^0.66
> df:? df:2
> {code}
> with ? either 2 or 3, depending on the run.
>
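
The order dependence in cases a) and b) can be sketched without Lucene. The sketch below assumes (this is an assumption, not something stated in the issue) that duplicate should clauses are merged through a map whose keys compare equal on the term alone, ignoring the docFreq they carry. {{Clause}}, {{dedup}} and {{docFreqFor}} are hypothetical names, and only the overlapping term f:a is modelled.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ShouldClauseDedupSketch {

    // Hypothetical clause: equality looks at the term only, not at the docFreq.
    static final class Clause {
        final String term;
        final int docFreq;

        Clause(String term, int docFreq) {
            this.term = term;
            this.docFreq = docFreq;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Clause && term.equals(((Clause) o).term);
        }

        @Override
        public int hashCode() {
            return term.hashCode();
        }
    }

    // Merges equal clauses by summing their boosts (each clause has boost 1.0 here).
    // Map.merge keeps the key that was inserted first, so the first docFreq wins.
    static Map<Clause, Float> dedup(List<Clause> clauses) {
        Map<Clause, Float> merged = new LinkedHashMap<>();
        for (Clause clause : clauses) {
            merged.merge(clause, 1.0f, Float::sum);
        }
        return merged;
    }

    // Returns the docFreq carried by the surviving clause for the given term.
    static int docFreqFor(Map<Clause, Float> merged, String term) {
        for (Clause clause : merged.keySet()) {
            if (clause.term.equals(term)) {
                return clause.docFreq;
            }
        }
        throw new IllegalArgumentException("no clause for " + term);
    }

    public static void main(String[] args) {
        // Case a): f:a arrives first with df 3, then with df 2.
        Map<Clause, Float> a = dedup(Arrays.asList(new Clause("f:a", 3), new Clause("f:a", 2)));
        // Case b): same clauses, opposite order.
        Map<Clause, Float> b = dedup(Arrays.asList(new Clause("f:a", 2), new Clause("f:a", 3)));
        System.out.println(docFreqFor(a, "f:a")); // 3
        System.out.println(docFreqFor(b, "f:a")); // 2
    }
}
```

In both orders the merged boost is 2.0, matching the {{(f:a)^2.0}} in the examples above; only the surviving docFreq changes. The boosted case c) is not modelled here.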