junrao commented on code in PR #15618:
URL: https://github.com/apache/kafka/pull/15618#discussion_r1543430656
##########
core/src/main/scala/kafka/log/UnifiedLog.scala:
##########
@@ -1320,10 +1320,8 @@ class UnifiedLog(@volatile var logStartOffset: Long,
// constant time access while being safe to use with concurrent collections unlike `toArray`.
val segmentsCopy = logSegments.toBuffer
val latestTimestampSegment = segmentsCopy.maxBy(_.maxTimestampSoFar)
- val latestTimestampAndOffset = latestTimestampSegment.maxTimestampAndOffsetSoFar
-
- Some(new TimestampAndOffset(latestTimestampAndOffset.timestamp,
- latestTimestampAndOffset.offset,
+ val batch = latestTimestampSegment.log.batches().asScala.maxBy(_.maxTimestamp())
Review Comment:
@chia7712: `latestTimestampSegment.log.batches()` scans the whole log
segment and could introduce unnecessary extra I/O, which could degrade
performance.
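To make the cost concern concrete, here is a rough sketch on a toy in-memory model. It is not Kafka code: `Batch`, `CountingSegment`, `scanAll`, and `readFrom` are hypothetical names, and the `batchesRead` counter only stands in for disk reads. It contrasts iterating every batch to find the maximum timestamp with reading forward from a known starting position.

```scala
// Toy model only: these types are assumptions for illustration, not Kafka classes.
final case class Batch(baseOffset: Long, lastOffset: Long, maxTimestamp: Long)

final class CountingSegment(batches: Vector[Batch]) {
  var batchesRead = 0 // stands in for the I/O done while reading batches

  // Mimics iterating over every batch in the segment.
  def scanAll(): Iterator[Batch] = batches.iterator.map { b =>
    batchesRead += 1
    b
  }

  // Mimics a read that starts at a known position (for example, one obtained
  // from an index) and stops at the batch containing `offset`.
  def readFrom(startIndex: Int, offset: Long): Option[Batch] =
    batches.iterator
      .drop(startIndex)
      .map { b => batchesRead += 1; b }
      .find(b => b.baseOffset <= offset && offset <= b.lastOffset)
}

object ScanCostDemo {
  def main(args: Array[String]): Unit = {
    val batches = Vector.tabulate(1000)(i => Batch(i * 10L, i * 10L + 9, 1000L + i))

    val full = new CountingSegment(batches)
    val newest = full.scanAll().maxBy(_.maxTimestamp)
    println(s"full scan: ${full.batchesRead} batches read, newest=$newest")

    val targeted = new CountingSegment(batches)
    val found = targeted.readFrom(startIndex = 990, offset = 9995L)
    println(s"targeted read: ${targeted.batchesRead} batches read, found=$found")
  }
}
```

In this model the full scan touches all 1000 batches, while the targeted read touches only the batches between the starting position and the match.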
> Hence we have to use condition baseOffset <= offset <= lastOffset to find batch.
I am not sure I understand this. Looking up a batch by either its baseOffset
or its lastOffset will locate the same batch using the offset index, right?
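A minimal sketch of the reasoning behind that question, again on a toy model rather than Kafka's actual index code. `BatchRange`, `IndexEntry`, and `SparseIndexLookup` are hypothetical names, and the index layout (a sparse list of offset-to-position entries) is a simplifying assumption. The lookup finds the largest index entry at or below the target offset, then reads forward to the first batch whose last offset is at or above the target.

```scala
// Toy model only: these types are assumptions for illustration, not Kafka classes.
final case class BatchRange(baseOffset: Long, lastOffset: Long)
final case class IndexEntry(offset: Long, position: Int)

object SparseIndexLookup {
  // Position of the largest index entry whose offset is <= target,
  // falling back to the start of the segment.
  def startPosition(index: Vector[IndexEntry], target: Long): Int =
    index.takeWhile(_.offset <= target).lastOption.map(_.position).getOrElse(0)

  // From the indexed position, read forward to the first batch
  // whose lastOffset is >= target.
  def lookup(batches: Vector[BatchRange], index: Vector[IndexEntry], target: Long): Option[BatchRange] =
    batches.drop(startPosition(index, target)).find(_.lastOffset >= target)

  def main(args: Array[String]): Unit = {
    val batches = Vector(BatchRange(0, 4), BatchRange(5, 9), BatchRange(10, 19), BatchRange(20, 24))
    val index = Vector(IndexEntry(0, 0), IndexEntry(10, 2)) // sparse: not every batch is indexed
    val b = batches(2)
    println(lookup(batches, index, b.baseOffset))
    println(lookup(batches, index, b.lastOffset))
  }
}
```

Under these assumptions, any target offset between a batch's baseOffset and lastOffset resolves to that same batch, so either bound works as the lookup key.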