Re: [PR] BooleanScorer doesn't optimize for TwoPhaseIterator [lucene]

via GitHub Sun, 16 Mar 2025 13:31:31 -0700


jpountz commented on PR #14357:
URL: https://github.com/apache/lucene/pull/14357#issuecomment-2727625240


   > If one or more DISI has a high cost (irrespective of TPIs), thus matching 
many docs, I could see avoiding BS1 as well.
   
   I imagine that your idea is that if most of the cost comes from a single 
DISI, then the heap doesn't need to be reordered every time, so the heap 
reordering overhead may be less than the bitset overhead. `BooleanScorer` has 
an optimization for a similar situation: if only a single clause has matches 
in a window of 4,096 docs, then it skips the bit set as an intermediate 
representation of matches and feeds this clause directly into the collector.
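
To illustrate the window optimization described above, here is a minimal sketch. It is not Lucene's actual `BooleanScorer` code: the class `WindowedDisjunctionSketch`, the `collect` method, and the representation of clauses as sorted `int[]` arrays of doc IDs are all assumptions made for illustration. Only the 4,096-doc window and the idea of bypassing the bit set when a single clause matches come from the comment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Lucene's BooleanScorer implementation.
final class WindowedDisjunctionSketch {
    static final int WINDOW_SIZE = 4096; // window size mentioned in the comment

    /**
     * Collects the union of all clauses in doc ID order.
     * Assumption: each clause is a sorted array of distinct doc IDs < maxDoc.
     */
    static List<Integer> collect(int[][] clauses, int maxDoc) {
        List<Integer> collected = new ArrayList<>();
        int[] pos = new int[clauses.length];      // read position per clause
        long[] bits = new long[WINDOW_SIZE / 64]; // one bit per doc in the window

        for (int base = 0; base < maxDoc; base += WINDOW_SIZE) {
            int end = Math.min(base + WINDOW_SIZE, maxDoc);

            // Count the clauses that have at least one match in [base, end).
            int matching = 0;
            int single = -1;
            for (int c = 0; c < clauses.length; c++) {
                if (pos[c] < clauses[c].length && clauses[c][pos[c]] < end) {
                    matching++;
                    single = c;
                }
            }
            if (matching == 0) {
                continue; // nothing matches in this window
            }

            if (matching == 1) {
                // Fast path: only one clause matches, so its docs are already
                // in order and can go straight to the collector.
                int[] docs = clauses[single];
                while (pos[single] < docs.length && docs[pos[single]] < end) {
                    collected.add(docs[pos[single]++]);
                }
                continue;
            }

            // General path: OR every clause into the bit set, then drain it.
            Arrays.fill(bits, 0L);
            for (int c = 0; c < clauses.length; c++) {
                int[] docs = clauses[c];
                while (pos[c] < docs.length && docs[pos[c]] < end) {
                    int offset = docs[pos[c]++] - base;
                    bits[offset >>> 6] |= 1L << offset;
                }
            }
            for (int w = 0; w < bits.length; w++) {
                long word = bits[w];
                while (word != 0) {
                    int bit = Long.numberOfTrailingZeros(word);
                    collected.add(base + (w << 6) + bit);
                    word &= word - 1; // clear the lowest set bit
                }
            }
        }
        return collected;
    }
}
```

With clauses `{1, 5, 5000}` and `{5, 7}`, the first window goes through the bit set because both clauses match there, while the second window contains only doc 5000 from the first clause and takes the direct path.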
   
   > An aside, if we are going to refer to these as BS1 vs BS2, they should 
have names more clearly reflecting this.
   
   Agreed to stop using `BS1`/`BS2`. It's how Lucene's disjunction scorers were 
historically referred to and these names came naturally after reading your 
description message that abbreviated `BooleanScorer` to `BS`, but it doesn't 
properly reflect how Lucene scores disjunctions nowadays.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


