fthevenet opened a new pull request, #12212:
URL: https://github.com/apache/lucene/pull/12212

   This PR aims to address issue #12211: Searches made via DrillSideways may 
miss documents that should match the query.
   
   A more detailed explanation of the issue and the reasoning behind the fix 
can be found in the report linked above, but it basically boils down to the 
fact that the `score` method in `DrillSidewaysScorer` results in more than one 
consecutive call to the `matches` method exposed by the `TwoPhaseIterator` 
instance without re-positioning the iterator first.
   This, in turn, makes the matching of documents erratic and leads to more or 
less subtle issues in searches from the end user's point of view: documents 
that should match their query sometimes don't appear in the result list (though 
they may still appear, depending on other factors such as the inner type of 
query resulting from parsing, caching, the order in which docs were indexed, 
etc.).
   
   I propose solving the issue by initializing the position of the scorer by 
calling `nextDoc` on `baseApproximation` instead of `baseIterator`, which should 
produce the expected result regardless of the type of iterator.
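
   To illustrate the difference between the two initializations, here is a 
minimal, self-contained sketch. It does not use Lucene's classes: `ToyTwoPhase`, 
`nextApproximation` and `collect` are hypothetical names made up for this 
example, and the choice to have a repeated `matches` call return `false` is 
only an assumption used to model a stateful check (in a real iterator the 
outcome of a second call may simply be unpredictable).

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseSketch {

    /** Hypothetical stand-in for a two-phase iterator; NOT Lucene's TwoPhaseIterator. */
    public static final class ToyTwoPhase {
        public static final int NO_MORE_DOCS = Integer.MAX_VALUE;

        private final int[] candidates;   // docs returned by the cheap approximation
        private final boolean[] accepted; // whether each candidate passes the costly check
        private int pos = -1;
        private boolean consumed = false; // true once matches() was called at this position

        public ToyTwoPhase(int[] candidates, boolean[] accepted) {
            this.candidates = candidates;
            this.accepted = accepted;
        }

        private int docID() {
            return pos >= candidates.length ? NO_MORE_DOCS : candidates[pos];
        }

        /** Advances the approximation only; no confirmation is performed. */
        public int nextApproximation() {
            pos++;
            consumed = false;
            return docID();
        }

        /**
         * Confirms the current candidate. Modelled as valid only once per position:
         * a second call without re-positioning returns false (an assumption of this sketch).
         */
        public boolean matches() {
            if (consumed) {
                return false;
            }
            consumed = true;
            return accepted[pos];
        }

        /** Full iterator view: advances to the next confirmed doc, calling matches() internally. */
        public int nextDoc() {
            int doc;
            while ((doc = nextApproximation()) != NO_MORE_DOCS) {
                if (matches()) {
                    return doc;
                }
            }
            return NO_MORE_DOCS;
        }
    }

    /** Scoring loop that confirms each candidate with matches(), then advances the approximation. */
    public static List<Integer> collect(ToyTwoPhase it, boolean initWithNextDoc) {
        List<Integer> hits = new ArrayList<>();
        // The only difference between the two variants is this first positioning call.
        int doc = initWithNextDoc ? it.nextDoc() : it.nextApproximation();
        while (doc != ToyTwoPhase.NO_MORE_DOCS) {
            if (it.matches()) {
                hits.add(doc);
            }
            doc = it.nextApproximation();
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] candidates = {1, 4, 7};
        boolean[] accepted = {true, false, true};
        // Positioning via the full iterator: matches() is called twice on doc 1.
        System.out.println(collect(new ToyTwoPhase(candidates, accepted), true));
        // Positioning via the approximation: matches() is called once per doc.
        System.out.println(collect(new ToyTwoPhase(candidates, accepted), false));
    }
}
```

   In this toy model, positioning with the full iterator loses doc 1 because 
its confirmation has already been consumed by the time the loop checks it, 
while positioning with the approximation returns both 1 and 7.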
   
   In fact, looking back through the history of this code, it feels to me that 
calling `nextDoc` on `baseIterator` is a leftover from before two-phase 
iterators were introduced, and should have been changed then.



