[ https://issues.apache.org/jira/browse/LUCENE-9822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295404#comment-17295404 ]
Michael McCandless commented on LUCENE-9822:
--------------------------------------------

+1, no unit test needed for a one-line {{assert}} addition. Thanks [~gsmiller]!

{quote}
But if you are trying to do something like blocksize=512, seems like you would need to allow for more exceptions (e.g. 12 or something) for the patching to be effective for general purposes. Maybe worth checking literature as I don't know off the top of my head where these numbers (128, 3) etc came from.
{quote}

+1 – seems (naively) like the number of exceptions should probably grow linearly with the block size?

We could probably make some crazy offline tool that gathers all the ints we are encoding into a given index and then measures what compression we could achieve with different numbers of patched exceptions.

> Assert that ForUtil.BLOCK_SIZE can be encoded in a single byte in PForUtil
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-9822
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9822
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>    Affects Versions: master (9.0)
>            Reporter: Greg Miller
>            Priority: Trivial
>         Attachments: LUCENE-9822.patch
>
>
> PForUtil assumes that ForUtil.BLOCK_SIZE can be encoded in a single byte when
> generating "patch offsets". If this assumption doesn't hold, PForUtil will
> silently encode incorrect positions. While the BLOCK_SIZE isn't particularly
> configurable, it would be nice to assert this assumption early in PForUtil in
> the event that the BLOCK_SIZE changes in some future codec version.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
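
For illustration only, here is a minimal sketch of the failure mode the issue description refers to. It assumes patch offsets are stored as one unsigned byte per exception; the class and method names (PatchOffsetSketch, offsetsFitInOneByte, roundTrip) are hypothetical and are not taken from the Lucene source or from the attached patch.

```java
// Hypothetical sketch; this is NOT the actual PForUtil code or the LUCENE-9822 patch.
final class PatchOffsetSketch {
  // Largest value representable in one unsigned byte.
  static final int MAX_UNSIGNED_BYTE = 0xFF;

  /** Returns true if every index in [0, blockSize) can be stored in one unsigned byte. */
  public static boolean offsetsFitInOneByte(int blockSize) {
    return blockSize > 0 && blockSize - 1 <= MAX_UNSIGNED_BYTE;
  }

  /** Stores an index in a single byte and reads it back as an unsigned value. */
  public static int roundTrip(int index) {
    byte stored = (byte) index; // the narrowing cast silently keeps only the low 8 bits
    return Byte.toUnsignedInt(stored);
  }

  public static void main(String[] args) {
    System.out.println(offsetsFitInOneByte(128)); // true
    System.out.println(offsetsFitInOneByte(512)); // false
    System.out.println(roundTrip(127));           // 127, position preserved
    System.out.println(roundTrip(300));           // 44, position silently corrupted
  }
}
```

With a block size of 512, an exception at index 300 would be read back as index 44 without any error being raised, which is the kind of silent corruption an early assertion on the block size is meant to rule out.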