junrao commented on code in PR #21379:
URL: https://github.com/apache/kafka/pull/21379#discussion_r2875399185
##########
storage/src/main/java/org/apache/kafka/storage/internals/log/Cleaner.java:
##########
@@ -302,19 +327,26 @@ public void cleanSegments(UnifiedLog log,
 * @param upperBoundOffsetOfCleaningRound Next offset of the last batch in the source segment
* @param stats Collector for cleaning statistics
* @param currentTime The time at which the clean was initiated
+ * @param log The log instance for creating new segments if overflow occurs
+ *
+ * @return The current active destination segment (may be different from the input dest if overflow occurred)
Review Comment:
This API seems awkward. An alternative is to instead pass a starting position in sourceRecords to cleanInto(). Initially, we pass in 0 as the position; if cleanInto() hits a size limit, it throws an exception carrying the current position in sourceRecords. The caller catches this exception, creates a new destination segment, and calls cleanInto() again with the position from the exception and the new destination segment.
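To make the shape of this alternative concrete, here is a minimal, self-contained sketch of the exception-based resumption loop. All names here (`SegmentFullException`, `Segment`, the toy retention predicate) are illustrative stand-ins, not Kafka's actual types or the real `cleanInto` signature:

```java
import java.util.ArrayList;
import java.util.List;

public class CleanIntoSketch {

    // Carries the position in sourceRecords at which cleaning must resume.
    static class SegmentFullException extends Exception {
        final int resumePosition;
        SegmentFullException(int resumePosition) { this.resumePosition = resumePosition; }
    }

    // Toy stand-in for a destination log segment with a fixed capacity.
    static class Segment {
        final List<Integer> records = new ArrayList<>();
        final int capacity;
        Segment(int capacity) { this.capacity = capacity; }
    }

    // Copies retained records from source into dest, starting at startPos.
    // On hitting the size limit, throws with the current position so the
    // caller can roll a new segment and resume exactly where it stopped.
    static void cleanInto(List<Integer> source, int startPos, Segment dest)
            throws SegmentFullException {
        for (int pos = startPos; pos < source.size(); pos++) {
            if (dest.records.size() >= dest.capacity)
                throw new SegmentFullException(pos);
            int record = source.get(pos);
            if (record % 2 == 0)  // toy retention predicate
                dest.records.add(record);
        }
    }

    // Caller: on overflow, creates a fresh destination segment and retries
    // cleanInto() from the position carried by the exception.
    static List<Segment> clean(List<Integer> source, int segmentCapacity) {
        List<Segment> segments = new ArrayList<>();
        int pos = 0;
        while (true) {
            Segment dest = new Segment(segmentCapacity);
            segments.add(dest);
            try {
                cleanInto(source, pos, dest);
                return segments;
            } catch (SegmentFullException e) {
                pos = e.resumePosition;
            }
        }
    }
}
```

With this shape, cleanInto() never needs to create segments itself or return "the current active destination", which is the awkwardness the comment points at.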
##########
storage/src/main/java/org/apache/kafka/storage/internals/log/Cleaner.java:
##########
@@ -400,6 +432,36 @@ public boolean shouldRetainRecord(RecordBatch batch, Record record) {
if (outputBuffer.position() > 0) {
outputBuffer.flip();
MemoryRecords retained = MemoryRecords.readableRecords(outputBuffer);
+
+ // Check for TWO types of overflow BEFORE appending:
+ // 1. Offset overflow: offset range exceeds Integer.MAX_VALUE
Review Comment:
The grouping of the segments takes offset overflow into consideration. So,
it seems that we can't hit this?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]