benwtrent commented on code in PR #12699: URL: https://github.com/apache/lucene/pull/12699#discussion_r1420733099
##########
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##########
@@ -1190,4 +1176,63 @@ public void seekExact(long ord) {
   public long ord() {
     throw new UnsupportedOperationException();
   }
+
+  static class OutputAccumulator extends DataInput {
+
+    BytesRef[] outputs = new BytesRef[16];
+    BytesRef current;
+    int num;
+    int outputIndex;
+    int index;
+
+    void push(BytesRef output) {
+      if (output != Lucene90BlockTreeTermsReader.NO_OUTPUT) {

Review Comment:
   While we have strict contracts, I can see that `ByteSequenceOutputs#add(BytesRef, BytesRef)` asserts that the length is > 0. Here in `OutputAccumulator`, `readByte()` assumes that every `BytesRef` has a length of at least 1. If one had a length of 0, we would read past the end of the ref and return bytes from the underlying `byte[]` that we shouldn't.

   IMO `OutputAccumulator` needs to be much more cautious than `ByteSequenceOutputs#add(BytesRef, BytesRef)`, because `OutputAccumulator` doesn't make copies and relies on the underlying byte arrays not changing.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: issues-h...@lucene.apache.org
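
To make the concern in the review comment concrete, here is a minimal sketch of the kind of guard being asked for. It is not the code from the PR: `Ref` is a hypothetical stand-in for `BytesRef` (a view into a shared `byte[]`), and `Accumulator` is a simplified, hypothetical version of `OutputAccumulator` that does not extend `DataInput`. The assumption is only that skipping empty refs on `push` and bounds-checking in `readByte()` is one reasonable way to avoid reading bytes outside a zero-length ref.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-ins for illustration; these are not the Lucene classes.
final class OutputAccumulatorSketch {

  /** Minimal stand-in for a BytesRef: an (offset, length) view into a shared byte[]. */
  static final class Ref {
    final byte[] bytes;
    final int offset;
    final int length;

    Ref(byte[] bytes, int offset, int length) {
      this.bytes = bytes;
      this.offset = offset;
      this.length = length;
    }
  }

  /** Reads bytes across several pushed refs without copying them. */
  static final class Accumulator {
    private Ref[] outputs = new Ref[16];
    private int num; // number of refs pushed so far
    private int outputIndex; // ref currently being read
    private int index; // read position inside the current ref

    void push(Ref output) {
      // Skip empty refs up front so readByte() never starts on a zero-length view.
      if (output == null || output.length == 0) {
        return;
      }
      if (num == outputs.length) {
        outputs = Arrays.copyOf(outputs, num * 2);
      }
      outputs[num++] = output;
    }

    byte readByte() {
      // Move past exhausted refs; this also tolerates an empty ref if one slipped in.
      while (outputIndex < num && index >= outputs[outputIndex].length) {
        outputIndex++;
        index = 0;
      }
      if (outputIndex >= num) {
        throw new IllegalStateException("read past the end of the accumulated outputs");
      }
      Ref current = outputs[outputIndex];
      return current.bytes[current.offset + index++];
    }
  }
}
```

With a shared array `{1, 2, 3, 4}` and three pushed views `(0, 1)`, `(1, 0)`, `(2, 2)`, this sketch reads `1`, `3`, `4` and then throws. A reader that assumed every ref holds at least one byte would instead return `2` for the empty view, which is the out-of-bounds read described above.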