jainankitk commented on code in PR #14397:
URL: https://github.com/apache/lucene/pull/14397#discussion_r2021750116


##########
lucene/core/src/java/org/apache/lucene/codecs/lucene90/LZ4WithPresetDictCompressionMode.java:
##########
@@ -98,12 +98,17 @@ public void decompress(DataInput in, int originalLength, int offset, int length,
       final int blockLength = in.readVInt();
 
       final int numBlocks = readCompressedLengths(in, originalLength, dictLength, blockLength);
-
-      buffer = ArrayUtil.growNoCopy(buffer, dictLength + blockLength);
       bytes.length = 0;
-      // Read the dictionary
-      if (LZ4.decompress(in, dictLength, buffer, 0) != dictLength) {
-        throw new CorruptIndexException("Illegal dict length", in);
+      if (reused) {
+        assert buffer.length >= dictLength + blockLength;
+        in.skipBytes(compressedLengths[0]);
+      } else {
+        // Read the dictionary
+        buffer = ArrayUtil.growNoCopy(buffer, dictLength + blockLength);
+        if (LZ4.decompress(in, dictLength, buffer, 0) != dictLength) {
+          throw new CorruptIndexException("Illegal dict length", in);
+        }
+        reused = true;

Review Comment:
   I wonder whether we should expose a metric for how many times we could
reuse the dictionary versus how many times we had to read it from disk. That
would give useful insight into how effective this change is.
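
   To make the suggestion concrete, here is a minimal sketch of what such
counters could look like. `DictReuseStats` and its methods are hypothetical
names used only for illustration. They are not part of Lucene or of this PR,
and how the numbers would be surfaced is left open.

```java
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical helper (an assumption for illustration, not a Lucene class):
 * counts how often the preset dictionary was reused versus decompressed again.
 */
final class DictReuseStats {
  // LongAdder keeps increments cheap if several threads update the counters.
  private final LongAdder reused = new LongAdder();
  private final LongAdder loaded = new LongAdder();

  /** Call where the dictionary bytes are skipped (the `if (reused)` branch). */
  void recordReuse() {
    reused.increment();
  }

  /** Call where the dictionary is decompressed from the input (the `else` branch). */
  void recordLoad() {
    loaded.increment();
  }

  long reuseCount() {
    return reused.sum();
  }

  long loadCount() {
    return loaded.sum();
  }

  /** Fraction of decompress calls that reused the dictionary; 0.0 if none recorded. */
  double reuseRatio() {
    long total = reuseCount() + loadCount();
    return total == 0 ? 0.0 : (double) reuseCount() / total;
  }
}
```

   In the diff above, `recordReuse()` would go next to
`in.skipBytes(compressedLengths[0])` and `recordLoad()` next to
`reused = true`. A low `reuseRatio()` on real workloads would suggest the
change rarely pays off.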



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

