dajac commented on code in PR #16498:
URL: https://github.com/apache/kafka/pull/16498#discussion_r1662541019
##########
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/runtime/CoordinatorRuntime.java:
##########
@@ -936,62 +941,90 @@ private void append(
));
}
- // Compute the estimated size of the records.
- int estimatedSize = AbstractRecords.estimateSizeInBytes(
- currentBatch.builder.magic(),
- compression.type(),
- recordsToAppend
- );
+ if (isAtomic) {
Review Comment:
> To confirm my understanding, we take the original path when the append
should be atomic, which means we verify whether all records can fit in the
current batch. If not, we allocate a new batch then append the records. If the
append does not need to be atomic, we append individual records until we can
fit no more, then allocate a new batch.
>
> This will reduce the number of writes to the log as we will have more
filled batches on average. In practice, only the unload partition and cleanup
group metadata jobs will have non-atomic writes today so we should not expect
much impact.
>
> Is this correct?
Yes. This is correct. However, I did not do this change to improve the how
batches are filled but rather to ensure that we would not be stuck in the case
where we have a large number of records that could not fit in one batch.
##########
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/runtime/CoordinatorRuntime.java:
##########
@@ -936,62 +941,90 @@ private void append(
));
}
- // Compute the estimated size of the records.
- int estimatedSize = AbstractRecords.estimateSizeInBytes(
- currentBatch.builder.magic(),
- compression.type(),
- recordsToAppend
- );
+ if (isAtomic) {
Review Comment:
> One other thing I noticed is the ordering. For the atomic case, we create
the batch first and then replay whereas non-atomic we do the replay then add to
batch. Not sure if it makes a big difference though since we moved where we
enqueue the event.
This is a very good point and the code was wrong. I changed it to check if
the record can fit in the batch before replaying. I added a test for this too.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]