[
https://issues.apache.org/jira/browse/KAFKA-16073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17802024#comment-17802024
]
Satish Duggana commented on KAFKA-16073:
----------------------------------------
We discussed one possible solution: updating local-log-start-offset before
the segments are removed from the in-memory state and scheduled for deletion.
We still need to think through the end-to-end scenarios.
cc [~Kamal C]
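To make the ordering idea above concrete, here is a minimal toy sketch in Java. It is not Kafka code: ToyLocalLog, looksOutOfRange and deleteBefore are hypothetical names, the offsets are made up, and the betweenSteps callback only simulates a fetch that happens to run between the two steps of a deletion.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Toy model only: these names are hypothetical and are not Kafka's classes.
class ToyLocalLog {
    // base offset -> end offset (exclusive) of each in-memory segment
    private final ConcurrentSkipListMap<Long, Long> segments = new ConcurrentSkipListMap<>();
    private volatile long localLogStartOffset;

    // boundaries 100, 200, 300, 400 create segments [100,200), [200,300), [300,400)
    ToyLocalLog(long... boundaries) {
        for (int i = 0; i + 1 < boundaries.length; i++) {
            segments.put(boundaries[i], boundaries[i + 1]);
        }
        localLogStartOffset = boundaries[0];
    }

    // True when the offset is not in any in-memory segment and is also not
    // below localLogStartOffset, so a reader has no reason to try remote storage.
    boolean looksOutOfRange(long offset) {
        Map.Entry<Long, Long> seg = segments.floorEntry(offset);
        boolean inMemory = seg != null && offset < seg.getValue();
        return !inMemory && offset >= localLogStartOffset;
    }

    // Drops every segment whose base offset is below newStart.
    // betweenSteps runs between the two steps and stands in for a concurrent fetch.
    void deleteBefore(long newStart, boolean updateOffsetFirst, Runnable betweenSteps) {
        if (updateOffsetFirst) {
            localLogStartOffset = newStart;      // proposed order: publish the offset first
            betweenSteps.run();
            segments.headMap(newStart).clear();  // then drop the segments from memory
        } else {
            segments.headMap(newStart).clear();  // reported order: segments disappear first
            betweenSteps.run();
            localLogStartOffset = newStart;      // offset catches up later
        }
    }

    public static void main(String[] args) {
        ToyLocalLog reported = new ToyLocalLog(100, 200, 300, 400);
        reported.deleteBefore(300, false,
            () -> System.out.println("remove first: " + reported.looksOutOfRange(200)));

        ToyLocalLog proposed = new ToyLocalLog(100, 200, 300, 400);
        proposed.deleteBefore(300, true,
            () -> System.out.println("update first: " + proposed.looksOutOfRange(200)));
    }
}
```

In this toy model the "remove first" order exposes a window in which offset 200 looks out of range, while the "update first" order does not. Whether the same holds end to end in the broker is exactly the open question raised above.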
> Kafka Tiered Storage Bug: Consumer Fetch Error Due to Delayed
> localLogStartOffset Update During Segment Deletion
> ----------------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-16073
> URL: https://issues.apache.org/jira/browse/KAFKA-16073
> Project: Kafka
> Issue Type: Bug
> Components: core, Tiered-Storage
> Affects Versions: 3.6.1
> Reporter: hzh0425
> Assignee: hzh0425
> Priority: Major
> Labels: KIP-405, kip-405, tiered-storage
> Fix For: 3.6.1
>
>
> The bug in Apache Kafka's tiered storage feature is a delayed update of
> {{localLogStartOffset}} in the {{UnifiedLog.deleteSegments}} method, which
> affects consumer fetch operations. When segments are deleted from the log's
> in-memory state, {{localLogStartOffset}} is not updated at the same time.
> Concurrently, {{ReplicaManager.handleOffsetOutOfRangeError}} checks whether a
> consumer's fetch offset is less than {{localLogStartOffset}}. If the fetch
> offset is not less than that value, which may be stale, Kafka erroneously
> sends an {{OffsetOutOfRangeException}} to the consumer.
> In a specific concurrent scenario, consider three offsets with
> {{offset1 < offset2 < offset3}}. A client requests data at {{offset2}}. A
> background deletion process has removed the segments from memory but has not
> yet updated {{localLogStartOffset}} from {{offset1}} to {{offset3}}.
> Consequently, when the fetch offset ({{offset2}}) is compared against the
> stale {{offset1}} in {{ReplicaManager.handleOffsetOutOfRangeError}}, it
> incorrectly triggers an {{OffsetOutOfRangeException}}. The issue arises
> because {{localLogStartOffset}} is updated out of sync with segment removal,
> leading to incorrect handling of consumer fetch requests and potential data
> access errors.
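For illustration, the comparison described in the report can be modelled in a few lines of Java. This is a minimal sketch, not Kafka's code: FetchRangeCheck, Outcome and onLocalReadMiss are hypothetical names, the concrete offsets are made up, and the lower bound against the log start offset is an assumption about how the real check is shaped.

```java
// Hypothetical model of the decision taken after a local read misses.
final class FetchRangeCheck {
    enum Outcome { READ_FROM_REMOTE, OFFSET_OUT_OF_RANGE }

    // Assumption: an offset is served from remote storage only when it lies in
    // [logStartOffset, localLogStartOffset); anything else is reported as out of range.
    static Outcome onLocalReadMiss(long fetchOffset, long logStartOffset, long localLogStartOffset) {
        boolean inRemoteRange = logStartOffset <= fetchOffset && fetchOffset < localLogStartOffset;
        return inRemoteRange ? Outcome.READ_FROM_REMOTE : Outcome.OFFSET_OUT_OF_RANGE;
    }

    public static void main(String[] args) {
        long offset1 = 100, offset2 = 200, offset3 = 300; // offset1 < offset2 < offset3

        // Segments already removed from memory, localLogStartOffset still offset1 (stale).
        System.out.println(onLocalReadMiss(offset2, 0, offset1)); // prints OFFSET_OUT_OF_RANGE

        // Same fetch once localLogStartOffset has been advanced to offset3.
        System.out.println(onLocalReadMiss(offset2, 0, offset3)); // prints READ_FROM_REMOTE
    }
}
```

The only input that differs between the two calls is the value of localLogStartOffset, which is why the timing of its update decides whether the consumer sees an error.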
--
This message was sent by Atlassian Jira
(v8.20.10#820010)