[ 
https://issues.apache.org/jira/browse/LUCENE-9822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295492#comment-17295492
 ] 

Greg Miller commented on LUCENE-9822:
-------------------------------------

Yeah, interesting. Looking at the code, we're packing the number of bits used 
per entry along with the number of patches in a single byte. Because we max out 
at 32 bits/entry, we can encode the number of bits/entry in 5 bits, leaving 3 
more for the number of patches (at most 7 per block). It seems like an 
interesting experiment to bring 
in one more byte for encoding the number of patches, significantly raising the 
ceiling on how many entries we can patch in. Just a quick thought from looking 
at the code, but I'll see if I can dig into the literature a little.
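
To make the bit budget concrete, here is a minimal sketch of packing both 
fields into one byte. The class and method names are hypothetical and are not 
taken from PForUtil, and the sketch assumes the stored width fits the range 
0..31 that 5 bits can hold.

```java
/** Hypothetical sketch, not the PForUtil source: one byte holding two fields. */
final class PatchTokenSketch {
    // Low 5 bits carry the bits-per-entry value (0..31).
    static final int BITS_PER_ENTRY_MASK = (1 << 5) - 1;
    // The remaining high 3 bits carry the patch count (0..7).
    static final int MAX_PATCHES = (1 << 3) - 1;

    /** Packs both fields into a single byte, rejecting values that do not fit. */
    static byte pack(int bitsPerEntry, int numPatches) {
        if (bitsPerEntry < 0 || bitsPerEntry > BITS_PER_ENTRY_MASK) {
            throw new IllegalArgumentException("bitsPerEntry out of range: " + bitsPerEntry);
        }
        if (numPatches < 0 || numPatches > MAX_PATCHES) {
            throw new IllegalArgumentException("numPatches out of range: " + numPatches);
        }
        return (byte) ((numPatches << 5) | bitsPerEntry);
    }

    /** Reads the low 5 bits back. */
    static int bitsPerEntry(byte token) {
        return token & BITS_PER_ENTRY_MASK;
    }

    /** Reads the high 3 bits back; the mask undoes Java's sign extension. */
    static int numPatches(byte token) {
        return (token & 0xFF) >>> 5;
    }
}
```

With this layout the patch count cannot exceed 7. Moving it to its own byte 
would lift that ceiling to 255, at the cost of one extra byte per block.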

> Assert that ForUtil.BLOCK_SIZE can be encoded in a single byte in PForUtil
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-9822
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9822
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>    Affects Versions: master (9.0)
>            Reporter: Greg Miller
>            Priority: Trivial
>         Attachments: LUCENE-9822.patch
>
>
> PForUtil assumes that ForUtil.BLOCK_SIZE can be encoded in a single byte when 
> generating "patch offsets". If this assumption doesn't hold, PForUtil will 
> silently encode incorrect positions. While the BLOCK_SIZE isn't particularly 
> configurable, it would be nice to assert this assumption early in PForUtil in 
> the event that the BLOCK_SIZE changes in some future codec version.
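
The single-byte assumption described above can be illustrated with a short 
sketch. This is not the attached patch: the class, the BLOCK_SIZE value of 
128, and the helper names are assumptions made for illustration only.

```java
/** Hypothetical sketch, not the attached patch: checking the one-byte assumption. */
final class BlockSizeCheckSketch {
    // Assumed stand-in for ForUtil.BLOCK_SIZE.
    static final int BLOCK_SIZE = 128;

    /** Offsets run from 0 to blockSize - 1, so the largest must be at most 255. */
    static boolean offsetsFitInOneByte(int blockSize) {
        return blockSize > 0 && blockSize <= 256;
    }

    /** Writes an offset as one byte and reads it back as an unsigned value. */
    static int roundTripOffset(int offset) {
        return ((byte) offset) & 0xFF;
    }

    /** Fails fast instead of letting offsets be truncated silently later. */
    static void checkBlockSize(int blockSize) {
        if (!offsetsFitInOneByte(blockSize)) {
            throw new IllegalStateException(
                "block size must fit in one byte, got " + blockSize);
        }
    }
}
```

In this sketch, roundTripOffset(300) returns 44 rather than 300, which is the 
kind of silent corruption an early check would surface.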


