jpountz commented on PR #15160: URL: https://github.com/apache/lucene/pull/15160#issuecomment-3258824979
This is due to the fact that this query needs to decode more integers overall. Conceptually, moving to a block size of 256 is similar to keeping a block size of 128 but always decoding two consecutive blocks together. Previously, there were cases where we would decode a 128-doc block but skip the next one because its impacts reported that none of its docs had a competitive score. With blocks of size 256 this skipping can no longer happen: we always decode or skip at least 256 postings, so all queries need to decode more integers.

On the other hand, there are also efficiency gains from working on larger blocks. For instance, all the decision making we do to figure out whether a block should be decoded (such as computing the block's maximum score) gets amortized over more doc IDs. With some queries the efficiency gains outweigh the additional decoding overhead; with other queries it's the opposite, and `FilteredTerm` seems to be in the latter camp.

We could look into evaluating `FilteredTerm` via `BlockMaxConjunctionScorer` instead of a `DefaultBulkScorer` on top of a filtered `ImpactsDISI`; this may help `FilteredTerm` a bit by vectorizing more of its execution logic.
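As a rough illustration of the skip-vs-decode trade-off described above, here is a minimal sketch (hypothetical names, not Lucene's actual `ImpactsDISI` or bulk-scorer code): with 128-doc blocks, a non-competitive second block can be skipped on its own; once two blocks are fused into a 256-doc block, the merged maximum score forces decoding all 256 postings, but the per-block bookkeeping is amortized over twice as many docs.

```java
// Hypothetical sketch of block-max skipping, not Lucene's actual implementation.
final class BlockSkipSketch {

  /** Stand-in for per-block impact metadata: the max score of any doc in the block. */
  record BlockImpacts(float maxScore) {}

  /** A block must be decoded only if it may contain a competitive hit. */
  static boolean shouldDecode(BlockImpacts impacts, float minCompetitiveScore) {
    return impacts.maxScore() >= minCompetitiveScore;
  }

  public static void main(String[] args) {
    float minCompetitive = 3.0f;

    // Two consecutive 128-doc blocks: the first is competitive, the second is not.
    BlockImpacts first128 = new BlockImpacts(4.2f);
    BlockImpacts second128 = new BlockImpacts(1.1f);

    // Block size 128: decode the first block, skip the second entirely.
    System.out.println("128-doc blocks: decode first = " + shouldDecode(first128, minCompetitive)
        + ", decode second = " + shouldDecode(second128, minCompetitive));

    // Block size 256: the two blocks are fused, so the merged max score forces decoding
    // all 256 postings even though half of them could previously have been skipped.
    BlockImpacts fused256 = new BlockImpacts(Math.max(first128.maxScore(), second128.maxScore()));
    System.out.println("256-doc block: decode = " + shouldDecode(fused256, minCompetitive)
        + " (covers all 256 postings)");
  }
}
```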
