kkewwei commented on code in PR #14397:
URL: https://github.com/apache/lucene/pull/14397#discussion_r2048995276

##########
lucene/core/src/java/org/apache/lucene/codecs/lucene90/compressing/Lucene90CompressingStoredFieldsReader.java:
##########
@@ -512,6 +512,7 @@ private void doReset(int docID) throws IOException {
           bytes.offset = bytes.length = 0;
           for (int decompressed = 0; decompressed < totalLength; ) {
             final int toDecompress = Math.min(totalLength - decompressed, chunkSize);
+            decompressor.reset();
             decompressor.decompress(fieldsStream, toDecompress, 0, toDecompress, spare);

Review Comment:
   I tried relying only on the outer `reuseIfPossible` flag to decide whether to cache the preset dictionary, but that does not work. In the following case the outer caller must call `reset` to clear the cache. We have two chunks:
   1. chunk0 [doc0(length>0)]
   2. chunk1 [doc0(length=0), doc1(length=1)]

   The steps are as follows:
   1. Read chunk0/doc0 with `reuseIfPossible`=false.
   2. Read chunk1/doc0 with `reuseIfPossible`=false. Because the length is 0, Lucene does not read the preset dictionary, so it is not cached.
   3. Read chunk1/doc1. doc1 is in chunk1, so `reuseIfPossible`=true, but the preset dictionary is not cached and Lucene throws an exception.

   In this case, we should call `reset` in step 1.
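
   To make the sequence easier to follow, here is a minimal toy model of the scenario. It is not Lucene's code: `ToyDecompressor`, `decompressTrustingFlag`, `decompressAfterReset`, and the integer chunk ids are hypothetical names invented for this sketch, and it assumes the caller invokes `reset` each time it moves to a new chunk. The only behavior taken from the description above is that a zero-length read returns before the dictionary is read.

```java
import java.util.ArrayList;
import java.util.List;

public class PresetDictReuseSketch {

  /** Hypothetical stand-in for a decompressor that caches a per-chunk preset dictionary. */
  static final class ToyDecompressor {
    int cachedDictChunk = -1; // -1 means no dictionary is cached
    final List<String> log = new ArrayList<>();

    /** Clears the cached dictionary (assumed to be called when the reader enters a new chunk). */
    void reset() {
      cachedDictChunk = -1;
    }

    /** Flag-only variant: trusts the caller's reuseIfPossible hint. */
    void decompressTrustingFlag(int chunk, int length, boolean reuseIfPossible) {
      if (length == 0) {
        return; // empty document: the dictionary is never read, so nothing is cached
      }
      if (reuseIfPossible) {
        if (cachedDictChunk != chunk) {
          // Toy check: the hint said "reuse", but this chunk's dictionary was never cached.
          throw new IllegalStateException("no cached dictionary for chunk " + chunk);
        }
        log.add("reuse dict of chunk " + chunk);
      } else {
        cachedDictChunk = chunk;
        log.add("read dict of chunk " + chunk);
      }
    }

    /** Reset-based variant: reuses a dictionary only if one is actually cached. */
    void decompressAfterReset(int chunk, int length) {
      if (length == 0) {
        return; // same early return as above
      }
      if (cachedDictChunk == -1) {
        cachedDictChunk = chunk;
        log.add("read dict of chunk " + chunk);
      } else {
        log.add("reuse dict of chunk " + cachedDictChunk);
      }
    }
  }

  /** Replays the three reads with the flag-only variant; returns true if the third read fails. */
  static boolean flagOnlyFails() {
    ToyDecompressor d = new ToyDecompressor();
    try {
      d.decompressTrustingFlag(0, 5, false); // chunk0/doc0, length > 0
      d.decompressTrustingFlag(1, 0, false); // chunk1/doc0, length = 0
      d.decompressTrustingFlag(1, 1, true);  // chunk1/doc1, same chunk as the previous read
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  /** Replays the same reads, calling reset() whenever the reader moves to a new chunk. */
  static List<String> resetBasedLog() {
    ToyDecompressor d = new ToyDecompressor();
    d.reset();
    d.decompressAfterReset(0, 5); // chunk0/doc0
    d.reset();
    d.decompressAfterReset(1, 0); // chunk1/doc0, nothing cached
    d.decompressAfterReset(1, 1); // chunk1/doc1, cache is empty so the dictionary is read
    return d.log;
  }

  public static void main(String[] args) {
    System.out.println("flag-only sequence fails: " + flagOnlyFails());
    System.out.println("reset-based sequence: " + resetBasedLog());
  }
}
```

   In this sketch the reset-based variant decides from its own cache state rather than from the caller's hint, so the empty doc0 in chunk1 cannot leave it expecting a dictionary that was never read.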