vigyasharma commented on code in PR #966:
URL: https://github.com/apache/lucene/pull/966#discussion_r901050635


##########
lucene/core/src/java/org/apache/lucene/index/TermsHashPerField.java:
##########
@@ -230,9 +230,29 @@ final void writeByte(int stream, byte b) {
   }
 
   final void writeBytes(int stream, byte[] b, int offset, int len) {
-    // TODO: optimize
     final int end = offset + len;
-    for (int i = offset; i < end; i++) writeByte(stream, b[i]);
+    int streamAddress = streamAddressOffset + stream;
+    int upto = termStreamAddressBuffer[streamAddress];
+    byte[] slice = bytePool.buffers[upto >> ByteBlockPool.BYTE_BLOCK_SHIFT];
+    assert slice != null;
+    int sliceOffset = upto & ByteBlockPool.BYTE_BLOCK_MASK;
+
+    while (slice[sliceOffset] == 0 && offset < end) {
+      slice[sliceOffset++] = b[offset++];
+      (termStreamAddressBuffer[streamAddress])++;
+    }
+
+    while (offset < end) {
+      int offsetAndLength = bytePool.allocKnownSizeSlice(slice, sliceOffset);
+      sliceOffset = offsetAndLength >> 8;
+      int sliceLength = offsetAndLength & 0xff;
+      slice = bytePool.buffer;
+      int writeLength = Math.min(sliceLength - 1, end - offset);
+      System.arraycopy(b, offset, slice, sliceOffset, writeLength);
+      sliceOffset += writeLength;
+      offset += writeLength;
+      termStreamAddressBuffer[streamAddress] = sliceOffset + 
bytePool.byteOffset;
+    }

Review Comment:
   Ah, I see what you are saying. This is some deep coupling between the 
`ByteSliceReader` and the `ByteBlockPool`. 
   
   `ByteSliceReader.nextSlice()` just assumes that the slice is the next higher 
level in `ByteBlockPool.NEXT_LEVEL_ARRAY`. Wonder if we could set the 
`ByteSliceReader.level` to the right value before calling `nextSlice()`. The 
buffer seems to contain the next level value [in its last few 
bits](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/util/ByteBlockPool.java#L268-L269).
 `allocKnownSizeSlice()` would then return the next higher slice size for 
`(end-offset)` in `LEVEL_SIZE_ARRAY`.
   
   It might be a more intrusive change though. ByteBlockPool docstring calls 
out that it allocates *"slices of increasing lengths"*, so there might be other 
places that hold this expectation. Maybe worth playing around in a separate 
spin off PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to