junrao commented on code in PR #15621:
URL: https://github.com/apache/kafka/pull/15621#discussion_r1548322220
##########
clients/src/main/java/org/apache/kafka/common/record/MemoryRecordsBuilder.java:
##########
@@ -240,25 +240,40 @@ public MemoryRecords build() {
return builtRecords;
}
+
/**
- * Get the max timestamp and its offset. The details of the offset
returned are a bit subtle.
- * Note: The semantic for the offset of max timestamp is the first offset
with the max timestamp if there are multi-records having same timestamp.
- *
- * If the log append time is used, the offset will be the first offset of
the record.
- *
- * If create time is used, the offset will always be the offset of the
record with the max timestamp.
- *
- * If it's NO_TIMESTAMP (i.e. MAGIC_VALUE_V0), we'll return offset -1
since no timestamp info in records.
+ * We want to align the shallowOffsetOfMaxTimestamp for all paths (leader
append, follower append, and index recovery)
+ * The definition of shallowOffsetOfMaxTimestamp is the last offset of the
batch having max timestamp.
+ * If there are many batches having same max timestamp, we pick up the
earliest batch.
+ * <p>
+ * If the log append time is used, the offset will be the last offset
unless no compression is used and
+ * the message format version is 0 or 1, in which case, it will be -1.
Review Comment:
Still not very accurate. For message format version 0, the offset will be
-1. For message format version 1, the offset will be the first offset.
##########
clients/src/main/java/org/apache/kafka/common/record/MemoryRecordsBuilder.java:
##########
@@ -240,25 +240,40 @@ public MemoryRecords build() {
return builtRecords;
}
+
/**
- * Get the max timestamp and its offset. The details of the offset
returned are a bit subtle.
- * Note: The semantic for the offset of max timestamp is the first offset
with the max timestamp if there are multi-records having same timestamp.
- *
- * If the log append time is used, the offset will be the first offset of
the record.
- *
- * If create time is used, the offset will always be the offset of the
record with the max timestamp.
- *
- * If it's NO_TIMESTAMP (i.e. MAGIC_VALUE_V0), we'll return offset -1
since no timestamp info in records.
+ * We want to align the shallowOffsetOfMaxTimestamp for all paths (leader
append, follower append, and index recovery)
+ * The definition of shallowOffsetOfMaxTimestamp is the last offset of the
batch having max timestamp.
+ * If there are many batches having same max timestamp, we pick up the
earliest batch.
+ * <p>
+ * If the log append time is used, the offset will be the last offset
unless no compression is used and
+ * the message format version is 0 or 1, in which case, it will be -1.
+ * <p>
+ * If create time is used, the offset will be the last offset unless no
compression is used and the message
+ * format version is 0 or 1, in which case, it will be the offset of the
record with the max timestamp.
Review Comment:
For message format version 0, the offset will be -1. For message format
version 1, the offset will be the offset of the record with the max timestamp.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]