jpountz opened a new pull request, #14715:
URL: https://github.com/apache/lucene/pull/14715

   This test generates random boolean queries and ensures that setting a 
minimum number of matching SHOULD clauses returns a subset of the hits with the 
same scores.
   
   It already tries to work around accuracy loss due to arithmetic operations 
by allowing a delta of up to one ulp between the scores that these two queries 
produce for the same hit. However, the delta can sometimes be higher.
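
To make the "one ulp" tolerance concrete, here is a minimal standalone sketch of an ulp-based comparison. It is not the code from the test: the class name `UlpDelta`, the helper `withinUlps`, and the `maxUlps` parameter are hypothetical names introduced only for illustration.

```java
// Standalone illustration, not Lucene code. All names here are hypothetical.
final class UlpDelta {

  // Returns true if the two scores differ by at most maxUlps ulps,
  // where the ulp is taken at the larger of the two magnitudes.
  static boolean withinUlps(float expected, float actual, int maxUlps) {
    float magnitude = Math.max(Math.abs(expected), Math.abs(actual));
    float delta = maxUlps * Math.ulp(magnitude);
    return Math.abs(expected - actual) <= delta;
  }

  public static void main(String[] args) {
    float a = 1.0f;
    float b = Math.nextUp(a); // one representable float above a
    float c = Math.nextUp(b); // two representable floats above a

    System.out.println("a vs b, 1 ulp allowed: " + withinUlps(a, b, 1));
    System.out.println("a vs c, 1 ulp allowed: " + withinUlps(a, c, 1));
    System.out.println("a vs c, 2 ulps allowed: " + withinUlps(a, c, 2));
  }
}
```

With this kind of check, two scores that are a single representable float apart pass a one-ulp tolerance, while scores two floats apart only pass once the tolerance is widened.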
   
   For instance, consider the following query, which triggered the most recent 
test failure: `(data:5 data:5 data:5 data:6 +data:6 data:Z data:X -data:1)~2`. 
Without a minimum number of matching SHOULD clauses, it gets rewritten to 
`(data:5^3 +data:6^2 data:Z data:X -data:1)`. So the score contribution of 
`data:5` is computed as `(double) score(data:5) + (double) score(data:5) + 
(double) score(data:5)` in one case, and `(double) (score(data:5) * 3f)` 
(multiply first, then cast to a double) in the other case. The use of 
`ReqOptSumScorer` also contributes to accuracy loss, as the existing comment 
notes: for instance, `data:6` is part of both the required and the optional 
clauses in the first case, while it is only a required clause (with a 2x boost) 
in the other case. So accuracy loss accrues differently.
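
The difference between the two evaluation orders can be reproduced outside Lucene. The sketch below is a standalone illustration under assumptions: the class and method names are hypothetical, and `0.1f` is an arbitrary stand-in for `score(data:5)`, not a value taken from the failing test.

```java
// Standalone illustration, not Lucene code. All names here are hypothetical.
final class BoostVsSum {

  // Widens the float score to double and adds it n times,
  // similar to summing n duplicate clauses.
  static double sumOfClauses(float score, int n) {
    double sum = 0;
    for (int i = 0; i < n; i++) {
      sum += (double) score;
    }
    return sum;
  }

  // Multiplies in float precision first, then widens the rounded result,
  // similar to scoring a single boosted clause.
  static double boostedClause(float score, float boost) {
    return (double) (score * boost);
  }

  public static void main(String[] args) {
    float score = 0.1f; // arbitrary stand-in for score(data:5)
    double summed = sumOfClauses(score, 3);
    double boosted = boostedClause(score, 3f);

    System.out.println("summed  = " + summed);
    System.out.println("boosted = " + boosted);
    System.out.println("equal as doubles: " + (summed == boosted));
    System.out.println("difference: " + Math.abs(summed - boosted));
    System.out.println("one float ulp at this magnitude: " + Math.ulp((float) boosted));
  }
}
```

For this input the two doubles are not equal, because the float multiplication rounds before widening while the double sum does not. The gap is smaller than one float ulp here, but it feeds into later additions and casts, which is how the final scores can end up more than one ulp apart.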
   
   I don't think we should try too hard to avoid these accuracy losses, so I'm 
instead increasing the leniency of the test.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

