nknize commented on code in PR #12688: URL: https://github.com/apache/lucene/pull/12688#discussion_r1382429884
##########
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/lucene90/randomaccess/bitpacking/BitPacker.java:
##########

Review Comment:
> Since they are of the same size...

That's the difference. In your use case the records (blocks) are guaranteed to be the same size, whereas in the serialized tree case the records (tree nodes) are not. This is by design, to keep the resulting docvalue disk consumption as small as possible.

> ...by a quick glance it seems to me it encodes values with variable length (VInt, VLong). Maybe the random-access is achieved in different ways?

Yes to variable-length encoding. The "random-ness" isn't purely random, in that traversal of the serialized tree is DFS. Because the tree nodes are variable size, the serialized array includes copious "book-keeping" in the form of "sizeOf" values. During DFS traversal the first "sizeOf" value provides the size of the entire left tree. Pruning the left tree just means skipping that many bytes to get to the right tree, and this continues recursively. In practice we don't expect to ever "back up" in our DFS traversal, so there is only a `rewind` method that simply resets the offset values to 0.

The two use cases seem subtly different, but I could see roughly 80% overlap in the implementation. I'd love to efficiently encapsulate this logic for the next contributor that wants a random serialized traversal mechanism without a ridiculous amount of Java object overhead. Sounds like @bruno-roustant had the same need? Could be a good follow-on PR.
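To make the "sizeOf" skipping concrete, here is a minimal sketch of the idea, not the actual implementation discussed in this PR. The class name `SerializedTreeSketch`, the node layout (a leading left-subtree size, then a split value), and the hand-rolled VInt helpers are all assumptions made for illustration; it also assumes sorted, non-negative values.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch: a binary search tree serialized in pre-order (DFS order)
// with variable-length nodes. Assumed layout, not taken from the PR:
//   internal node: vInt(sizeOf left subtree in bytes), vInt(max value in left subtree),
//                  left subtree bytes, right subtree bytes
//   leaf node:     vInt(0), vInt(value)
// A serialized leaf is at least 2 bytes, so a left size of 0 can only mean "leaf".
class SerializedTreeSketch {

  // Variable-length int: 7 payload bits per byte, high bit set on all but the last byte.
  static void writeVInt(ByteArrayOutputStream out, int v) {
    while ((v & ~0x7F) != 0) {
      out.write((v & 0x7F) | 0x80);
      v >>>= 7;
    }
    out.write(v);
  }

  // Serializes sorted[lo..hi] (inclusive) into the assumed layout.
  static byte[] serialize(int[] sorted, int lo, int hi) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    if (lo == hi) {
      writeVInt(out, 0);
      writeVInt(out, sorted[lo]);
      return out.toByteArray();
    }
    int mid = (lo + hi) >>> 1;
    byte[] left = serialize(sorted, lo, mid);
    byte[] right = serialize(sorted, mid + 1, hi);
    writeVInt(out, left.length);   // the "sizeOf" book-keeping value
    writeVInt(out, sorted[mid]);   // largest value stored in the left subtree
    out.write(left, 0, left.length);
    out.write(right, 0, right.length);
    return out.toByteArray();
  }

  // Forward-only cursor over the serialized bytes; no per-node Java objects.
  static final class Reader {
    final byte[] buf;
    int offset;
    int nodesVisited;

    Reader(byte[] buf) {
      this.buf = buf;
    }

    int readVInt() {
      int b = buf[offset++];
      int v = b & 0x7F;
      for (int shift = 7; (b & 0x80) != 0; shift += 7) {
        b = buf[offset++];
        v |= (b & 0x7F) << shift;
      }
      return v;
    }

    // Traversal never backs up, so resetting to the start is the only "seek" needed.
    void rewind() {
      offset = 0;
      nodesVisited = 0;
    }

    boolean contains(int target) {
      while (true) {
        nodesVisited++;
        int leftSize = readVInt();
        if (leftSize == 0) {
          return readVInt() == target; // leaf
        }
        int leftMax = readVInt();
        if (target > leftMax) {
          offset += leftSize; // prune: skip the whole left subtree in one jump
        }
        // otherwise the left subtree starts at the current offset
      }
    }
  }

  public static void main(String[] args) {
    int[] values = {3, 200, 70000, 5000000};
    Reader reader = new Reader(serialize(values, 0, values.length - 1));
    System.out.println("contains(70000) = " + reader.contains(70000)
        + ", nodes visited = " + reader.nodesVisited);
    reader.rewind();
    System.out.println("contains(4) = " + reader.contains(4));
  }
}
```

Because each value takes a different number of bytes, node offsets cannot be computed from an index the way they can for fixed-size blocks; the stored left-subtree size is what lets the reader jump straight to the right subtree.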
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org