Re: [PR] Bump block size of postings to 256. [lucene]

via GitHub Mon, 08 Sep 2025 13:13:01 -0700


jpountz commented on PR #15160:
URL: https://github.com/apache/lucene/pull/15160#issuecomment-3267800260


   I did the codec dance so it's now ready for review. I'd rather not wait too 
long before merging because there will be conflicts that will be annoying to 
resolve. I don't expect many folks to look at the diff, but the important 
things to note are that:
    - The block size goes from 128 to 256 for everything: docs, freqs, 
positions, offsets, payload lengths.
    - Heap usage of postings enums increases a bit.
    - There are still 2 levels of skip data, but every 256 and 8,192 docs 
instead of every 128 and 4,096 docs.
    - The max number of exceptions for PFOR is still 7. I did not bump it to 15 
because the flag byte we're using takes 5 bits for the number of bits per value 
and 3 bits for the number of exceptions, and I didn't want to store it on 2 
bytes.
    - The optimization to compute the prefix sum on partially-decoded data is 
gone. We now fully decode the data before computing the prefix sum.
    - When moving the Lucene103 codec to backward-codecs, I removed the 
optimization to use the vector API to speed up decoding postings so that I 
would not have to give access to this API from lucene/backward-codecs.
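
To make the PFOR point concrete, here is a minimal sketch of a one-byte flag that spends 5 bits on the number of bits per value and 3 bits on the number of exceptions. The class name `PforFlagByte`, its method names, and the choice to put the exception count in the high bits are assumptions for illustration, not the code from this PR. It shows why 7 is the ceiling: 15 exceptions would need a 4th bit, and 5 + 4 bits no longer fit in a single byte.

```java
// Hypothetical helper for illustration only; not Lucene's actual class.
final class PforFlagByte {
    // 5 bits for bits-per-value gives the range 0..31.
    static final int BITS_PER_VALUE_WIDTH = 5;
    static final int MAX_BITS_PER_VALUE = (1 << BITS_PER_VALUE_WIDTH) - 1; // 31
    // The remaining 3 bits of the byte give 0..7 exceptions.
    static final int MAX_EXCEPTIONS = (1 << (8 - BITS_PER_VALUE_WIDTH)) - 1; // 7

    /** Packs both fields into one value that fits in an unsigned byte. */
    static int pack(int numExceptions, int bitsPerValue) {
        if (numExceptions < 0 || numExceptions > MAX_EXCEPTIONS) {
            throw new IllegalArgumentException("numExceptions out of range: " + numExceptions);
        }
        if (bitsPerValue < 0 || bitsPerValue > MAX_BITS_PER_VALUE) {
            throw new IllegalArgumentException("bitsPerValue out of range: " + bitsPerValue);
        }
        // Assumed layout: exceptions in the high 3 bits, bits-per-value in the low 5.
        return (numExceptions << BITS_PER_VALUE_WIDTH) | bitsPerValue;
    }

    /** Extracts the low 5 bits. */
    static int bitsPerValue(int token) {
        return token & MAX_BITS_PER_VALUE;
    }

    /** Extracts the high 3 bits. */
    static int numExceptions(int token) {
        return (token >>> BITS_PER_VALUE_WIDTH) & MAX_EXCEPTIONS;
    }

    public static void main(String[] args) {
        int token = pack(7, 12);
        System.out.println("token=" + token
            + " bitsPerValue=" + bitsPerValue(token)
            + " numExceptions=" + numExceptions(token));
    }
}
```

With this layout, `pack(15, 12)` is rejected, which is the constraint that would force a second byte if the exception limit were raised.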

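As a rough illustration of "fully decode, then prefix sum", the sketch below turns an already-decoded block of doc ID deltas into absolute doc IDs in one pass. The class `BlockPrefixSum` and its signature are hypothetical, and the bit-unpacking step that would produce the deltas is assumed to have happened already.

```java
// Hypothetical sketch for illustration only; not the code from this PR.
final class BlockPrefixSum {
    // Block size after this change, as stated in the comment above.
    static final int BLOCK_SIZE = 256;

    /**
     * Converts fully decoded deltas into absolute values in place.
     * After the call, buffer[i] == base + (sum of the original buffer[0..i]).
     */
    static void prefixSum(int[] buffer, int count, int base) {
        int sum = base;
        for (int i = 0; i < count; i++) {
            sum += buffer[i];
            buffer[i] = sum;
        }
    }

    public static void main(String[] args) {
        int[] deltas = new int[BLOCK_SIZE];
        java.util.Arrays.fill(deltas, 1); // every doc ID is 1 past the previous one
        prefixSum(deltas, BLOCK_SIZE, 99); // 99 = last doc ID of the previous block
        System.out.println("first=" + deltas[0] + " last=" + deltas[BLOCK_SIZE - 1]);
    }
}
```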

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

