tang-hi commented on code in PR #966:
URL: https://github.com/apache/lucene/pull/966#discussion_r901039371
##########
lucene/core/src/java/org/apache/lucene/index/TermsHashPerField.java:
##########
@@ -230,9 +230,29 @@ final void writeByte(int stream, byte b) {
}
final void writeBytes(int stream, byte[] b, int offset, int len) {
- // TODO: optimize
final int end = offset + len;
- for (int i = offset; i < end; i++) writeByte(stream, b[i]);
+ int streamAddress = streamAddressOffset + stream;
+ int upto = termStreamAddressBuffer[streamAddress];
+ byte[] slice = bytePool.buffers[upto >> ByteBlockPool.BYTE_BLOCK_SHIFT];
+ assert slice != null;
+ int sliceOffset = upto & ByteBlockPool.BYTE_BLOCK_MASK;
+
+ while (slice[sliceOffset] == 0 && offset < end) {
+ slice[sliceOffset++] = b[offset++];
+ (termStreamAddressBuffer[streamAddress])++;
+ }
+
+ while (offset < end) {
+ int offsetAndLength = bytePool.allocKnownSizeSlice(slice, sliceOffset);
+ sliceOffset = offsetAndLength >> 8;
+ int sliceLength = offsetAndLength & 0xff;
+ slice = bytePool.buffer;
+ int writeLength = Math.min(sliceLength - 1, end - offset);
+ System.arraycopy(b, offset, slice, sliceOffset, writeLength);
+ sliceOffset += writeLength;
+ offset += writeLength;
+ termStreamAddressBuffer[streamAddress] = sliceOffset +
bytePool.byteOffset;
+ }
Review Comment:
> Wow, this is an interesting change! Can you help me better understand this
optimization?
>
> Is `System.arraycopy()` much faster than manually copying like in your
while loop above. I would guess it is similar since there are no extra checks
for primitive or object type [[ref: fast array copy
comparison](https://www.javaspecialists.eu/archive/Issue124-Copying-Arrays-Fast.html)],
but I'm not sure. This is probably better than calling `writeByte()` in a loop
because we don't need to fetch all the pointers every time.
>
> A follow up idea: I think `allocSlice()` (or `allocKnownSizeSlice()`
here), always returns the next higher sized buffer slice.. Maybe
`allocKnownSizeSlice` could instead return a slice appropriate for the length
of pending items we need to write `(end - offset)`? We could then avoid having
to call alloc multiple times for larger writes.
thanks for your comment
1. Is System.arraycopy() much faster than manually copying like in your
while loop above?
I think it's faster than a manual for loop to copy. Maybe you are
interested in taking a look at this answer[Is it better to use
System.arraycopy(...) than a for loop for copying
arrays?](https://stackoverflow.com/a/49269866)
2. Maybe allocKnownSizeSlice could instead return a slice appropriate for
the length of pending items we need to write (end - offset)?
In the begining, I also want to implement in this way.But I found the
method `nextSlice` in `ByteSliceReader` always read slice with size 5, 14, 20,
30 ..., 200,200 which is same with `LEVEL_SIZE_ARRAY` in `ByteBlockPool`. So I
have to allocate slice with next-level-size instead of `end - offset`,
##########
lucene/core/src/java/org/apache/lucene/index/TermsHashPerField.java:
##########
@@ -230,9 +230,29 @@ final void writeByte(int stream, byte b) {
}
final void writeBytes(int stream, byte[] b, int offset, int len) {
- // TODO: optimize
final int end = offset + len;
- for (int i = offset; i < end; i++) writeByte(stream, b[i]);
+ int streamAddress = streamAddressOffset + stream;
+ int upto = termStreamAddressBuffer[streamAddress];
+ byte[] slice = bytePool.buffers[upto >> ByteBlockPool.BYTE_BLOCK_SHIFT];
+ assert slice != null;
+ int sliceOffset = upto & ByteBlockPool.BYTE_BLOCK_MASK;
+
+ while (slice[sliceOffset] == 0 && offset < end) {
+ slice[sliceOffset++] = b[offset++];
+ (termStreamAddressBuffer[streamAddress])++;
+ }
+
+ while (offset < end) {
+ int offsetAndLength = bytePool.allocKnownSizeSlice(slice, sliceOffset);
+ sliceOffset = offsetAndLength >> 8;
+ int sliceLength = offsetAndLength & 0xff;
+ slice = bytePool.buffer;
+ int writeLength = Math.min(sliceLength - 1, end - offset);
+ System.arraycopy(b, offset, slice, sliceOffset, writeLength);
+ sliceOffset += writeLength;
+ offset += writeLength;
+ termStreamAddressBuffer[streamAddress] = sliceOffset +
bytePool.byteOffset;
+ }
Review Comment:
thanks for your comment
1. Is System.arraycopy() much faster than manually copying like in your
while loop above?
I think it's faster than a manual for loop to copy. Maybe you are
interested in taking a look at this answer[Is it better to use
System.arraycopy(...) than a for loop for copying
arrays?](https://stackoverflow.com/a/49269866)
2. Maybe allocKnownSizeSlice could instead return a slice appropriate for
the length of pending items we need to write (end - offset)?
In the begining, I also want to implement in this way.But I found the
method `nextSlice` in `ByteSliceReader` always read slice with size 5, 14, 20,
30 ..., 200,200 which is same with `LEVEL_SIZE_ARRAY` in `ByteBlockPool`. So I
have to allocate slice with next-level-size instead of `end - offset`,
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]