gsmiller opened a new issue, #11922:
URL: https://github.com/apache/lucene/issues/11922

   ### Description
   
   I believe we have an opportunity to improve disjunction evaluation by "short 
circuiting" within DisjunctionDISIApproximation / DisjunctionScorer. When a 
disjunction clause _does not_ require scores, we don't need to actually advance 
all the postings to a common docID in all cases. With `#advance(doc)`, we 
should be able to short-circuit as soon as we find one postings list that 
contains the target doc. With `#nextDoc()`, we can do the same thing if we treat 
the target as "current doc + 1". When confirming a two-phase match, we can 
also take advantage of this by lazily advancing only as far as needed to find a 
confirming clause (in the worst case, if all two-phase checks fail, we'll 
still have to advance all the postings that "trail" the target doc).
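
To make the idea concrete, here is a minimal, self-contained sketch of the short-circuiting approximation. It is not the prototype and does not use Lucene's classes: `Postings` is a hypothetical array-backed stand-in for a postings iterator, `advanceCalls` exists only to show which lists were touched, and a linear scan over the clauses replaces the priority queue a real implementation would likely use. Two-phase confirmation is left out.

```java
import java.util.List;

class ShortCircuitDisjunction {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical stand-in for a postings iterator over sorted doc IDs. */
    static final class Postings {
        private final int[] docs;
        private int idx = -1;
        int advanceCalls = 0; // instrumentation only: how often this list was moved

        Postings(int... docs) { this.docs = docs; }

        int docID() {
            if (idx < 0) return -1;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        /** Moves to the first doc >= target and returns it. */
        int advance(int target) {
            advanceCalls++;
            if (idx < 0) idx = 0;
            while (idx < docs.length && docs[idx] < target) idx++;
            return docID();
        }
    }

    private final List<Postings> clauses;
    private int doc = -1;

    ShortCircuitDisjunction(List<Postings> clauses) { this.clauses = clauses; }

    int docID() { return doc; }

    /** Same as advance, treating the target as "current doc + 1". */
    int nextDoc() {
        return doc == NO_MORE_DOCS ? doc : advance(doc + 1);
    }

    int advance(int target) {
        int min = NO_MORE_DOCS;
        for (Postings p : clauses) {
            int d = p.docID();
            if (d < target) d = p.advance(target); // only move lists that trail the target
            if (d == target) return doc = target;  // short-circuit: one hit is enough
            min = Math.min(min, d);
        }
        // No clause contains the target; every clause is now at or past it,
        // so the smallest position is the next matching doc.
        return doc = min;
    }

    public static void main(String[] args) {
        ShortCircuitDisjunction d = new ShortCircuitDisjunction(List.of(
            new Postings(1, 5, 9), new Postings(1, 3, 9), new Postings(2, 9, 12)));
        StringBuilder sb = new StringBuilder();
        while (d.nextDoc() != NO_MORE_DOCS) sb.append(d.docID()).append(' ');
        System.out.println(sb.toString().trim()); // prints "1 2 3 5 9 12"
    }
}
```

Clauses left behind by a short-circuit are simply the ones that "trail" the current doc; they get caught up lazily on a later call, and only if no earlier clause already lands on that call's target.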
   
   Of course, any time scoring is provided by the disjunction, we have to 
advance everything, so this doesn't help there. But we've got some use-cases 
right now that use large disjunctions as filters, and it would be nice to 
short-circuit when possible.
   
   I've got a prototype of this idea that I _think_ is functionally correct and 
will put up a draft PR soon to better illustrate what I'm thinking. In the 
meantime, please poke holes in this idea if I'm totally off the mark or 
overlooking something important :)

