kkewwei commented on code in PR #14397:
URL: https://github.com/apache/lucene/pull/14397#discussion_r2048995276


##########
lucene/core/src/java/org/apache/lucene/codecs/lucene90/compressing/Lucene90CompressingStoredFieldsReader.java:
##########
@@ -512,6 +512,7 @@ private void doReset(int docID) throws IOException {
           bytes.offset = bytes.length = 0;
           for (int decompressed = 0; decompressed < totalLength; ) {
             final int toDecompress = Math.min(totalLength - decompressed, chunkSize);
+            decompressor.reset();
             decompressor.decompress(fieldsStream, toDecompress, 0, toDecompress, spare);

Review Comment:
   I tried relying only on the outer `reuseIfPossible` flag to decide whether
   the preset dictionary is cached, but that does not work. In the following
   case, the outer code must call `reset` to clear the cache.
   
   We have two chunks:
   1. chunk0 [doc0(length>0)]
   2. chunk1 [doc0(length=0), doc1(length=1)]
   
   The steps are as follows:
   1. Read chunk0/doc0 with `reuseIfPossible`=false.
   2. Read chunk1/doc0 with `reuseIfPossible`=false. Because the length is 0,
   Lucene does not read the `predict`, so the preset dictionary is not cached.
   3. Read chunk1/doc1. Here doc1 is in the current chunk1, so
   `reuseIfPossible`=true, but the preset dictionary is not cached, and Lucene
   throws an exception.
   
   In this case, we should call `reset` in step 1.
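
The sequence above can be reproduced with a small toy model. This is only a sketch under stated assumptions: `PresetDictDemo`, `ToyDecompressor`, `Doc`, and every method on them are hypothetical names made up for illustration, not Lucene classes, and the real decompressor is more involved. The sketch assumes a new chunk means `reuseIfPossible`=false, and that a zero-length document returns before the dictionary is read.

```java
// Toy model only: every class and method here is hypothetical, not a Lucene API.
class PresetDictDemo {

  /** One document read; only its chunk number and length matter here. */
  static final class Doc {
    final int chunk;
    final int length;

    Doc(int chunk, int length) {
      this.chunk = chunk;
      this.length = length;
    }
  }

  /** Hypothetical decompressor that caches the preset dictionary of one chunk. */
  static final class ToyDecompressor {
    Integer cachedDictChunk = null; // chunk whose dictionary is cached, or null

    void reset() {
      cachedDictChunk = null;
    }

    /** Variant A: trusts the caller's flag to know whether the dictionary is cached. */
    void decompressTrustingFlag(Doc doc, boolean reuseIfPossible) {
      if (doc.length == 0) {
        return; // nothing to decompress, so the dictionary is never read
      }
      if (reuseIfPossible) {
        if (cachedDictChunk == null || cachedDictChunk != doc.chunk) {
          throw new IllegalStateException("preset dict not cached for chunk " + doc.chunk);
        }
      } else {
        cachedDictChunk = doc.chunk; // read and cache this chunk's dictionary
      }
    }

    /** Variant B: ignores the flag and reads the dictionary whenever the cache is empty. */
    void decompressCheckingCache(Doc doc) {
      if (doc.length == 0) {
        return;
      }
      if (cachedDictChunk == null) {
        cachedDictChunk = doc.chunk;
      }
    }
  }

  /** The three reads from the steps above: chunk0/doc0, chunk1/doc0, chunk1/doc1. */
  static Doc[] scenario() {
    return new Doc[] {new Doc(0, 5), new Doc(1, 0), new Doc(1, 1)};
  }

  /** Returns true if relying on the flag alone throws for the given reads. */
  static boolean failsWithFlagOnly(Doc[] reads) {
    ToyDecompressor d = new ToyDecompressor();
    int currentChunk = -1;
    try {
      for (Doc doc : reads) {
        boolean reuseIfPossible = doc.chunk == currentChunk;
        d.decompressTrustingFlag(doc, reuseIfPossible);
        currentChunk = doc.chunk;
      }
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  /** Resets on every chunk change; returns the chunk whose dictionary ends up cached. */
  static Integer cachedChunkWithReset(Doc[] reads) {
    ToyDecompressor d = new ToyDecompressor();
    int currentChunk = -1;
    for (Doc doc : reads) {
      if (doc.chunk != currentChunk) {
        d.reset(); // outer code clears the cache when it moves to a new chunk
        currentChunk = doc.chunk;
      }
      d.decompressCheckingCache(doc);
    }
    return d.cachedDictChunk;
  }

  public static void main(String[] args) {
    System.out.println("flag only fails: " + failsWithFlagOnly(scenario()));
    System.out.println("cached chunk with reset: " + cachedChunkWithReset(scenario()));
  }
}
```

In this model the flag-only variant fails on the third read, because the zero-length read of chunk1/doc0 never refreshed the cache. With an explicit `reset` on each chunk change, the third read finds an empty cache and reads the dictionary itself.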
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

