[
https://issues.apache.org/jira/browse/KAFKA-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mickael Maison reassigned KAFKA-16157:
--------------------------------------
Assignee: Gaurav Narula
> Topic recreation with offline disk doesn't update leadership/shrink ISR
> correctly
> ---------------------------------------------------------------------------------
>
> Key: KAFKA-16157
> URL: https://issues.apache.org/jira/browse/KAFKA-16157
> Project: Kafka
> Issue Type: Bug
> Components: jbod, kraft
> Affects Versions: 3.7.0
> Reporter: Gaurav Narula
> Assignee: Gaurav Narula
> Priority: Blocker
> Fix For: 3.7.0
>
> Attachments: broker.log, broker.log.1, broker.log.10, broker.log.2,
> broker.log.3, broker.log.4, broker.log.5, broker.log.6, broker.log.7,
> broker.log.8, broker.log.9
>
>
> In a cluster with 4 brokers (`broker-1` to `broker-4`), each with 2 disks
> (`d1` and `d2`), we perform the following operations:
>
> # Create a topic `foo.test` with 10 partitions and RF 4. Let's assume the
> topic was created with id `rAujIqcjRbu_-E4UxgQT8Q`.
> # Start a producer in the background to produce to `foo.test`.
> # Break disk `d1` in `broker-1`. We simulate this by marking the log dir
> read-only.
> # Delete topic `foo.test`.
> # Recreate topic `foo.test`. Let's assume the topic was created with id
> `bgdrsv-1QjCLFEqLOzVCHg`.
> # Wait for 5 minutes.
> # Describe the recreated topic `foo.test`.
>
> We observe that `broker-1` is the leader of, and in sync for, a few partitions:
>
>
> {code:java}
> Topic: foo.test TopicId: bgdrsv-1QjCLFEqLOzVCHg PartitionCount: 10 ReplicationFactor: 4 Configs: min.insync.replicas=1,unclean.leader.election.enable=false
>   Topic: foo.test Partition: 0 Leader: 101 Replicas: 101,102,103,104 Isr: 101,102,103,104
>   Topic: foo.test Partition: 1 Leader: 102 Replicas: 102,103,104,101 Isr: 102,103,104
>   Topic: foo.test Partition: 2 Leader: 103 Replicas: 103,104,101,102 Isr: 103,104,102
>   Topic: foo.test Partition: 3 Leader: 104 Replicas: 104,101,102,103 Isr: 104,102,103
>   Topic: foo.test Partition: 4 Leader: 104 Replicas: 104,102,101,103 Isr: 104,102,103
>   Topic: foo.test Partition: 5 Leader: 102 Replicas: 102,101,103,104 Isr: 102,103,104
>   Topic: foo.test Partition: 6 Leader: 101 Replicas: 101,103,104,102 Isr: 101,103,104,102
>   Topic: foo.test Partition: 7 Leader: 103 Replicas: 103,104,102,101 Isr: 103,104,102
>   Topic: foo.test Partition: 8 Leader: 101 Replicas: 101,102,104,103 Isr: 101,102,104,103
>   Topic: foo.test Partition: 9 Leader: 102 Replicas: 102,104,103,101 Isr: 102,104,103
> {code}
>
>
> In this example, `broker-1` is the leader of partitions 0, 6 and 8.
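
The check above can be automated. The following is a minimal sketch, not part of the original report: it parses the per-partition rows of the describe output and lists the partitions a given broker leads or is in the ISR for. It assumes `broker-1` has broker id `101`, which the report implies but does not state; the function name and regular expression are illustrative.

```python
import re

# Per-partition rows copied from the describe output in the report.
DESCRIBE_OUTPUT = """\
Topic: foo.test Partition: 0 Leader: 101 Replicas: 101,102,103,104 Isr: 101,102,103,104
Topic: foo.test Partition: 1 Leader: 102 Replicas: 102,103,104,101 Isr: 102,103,104
Topic: foo.test Partition: 2 Leader: 103 Replicas: 103,104,101,102 Isr: 103,104,102
Topic: foo.test Partition: 3 Leader: 104 Replicas: 104,101,102,103 Isr: 104,102,103
Topic: foo.test Partition: 4 Leader: 104 Replicas: 104,102,101,103 Isr: 104,102,103
Topic: foo.test Partition: 5 Leader: 102 Replicas: 102,101,103,104 Isr: 102,103,104
Topic: foo.test Partition: 6 Leader: 101 Replicas: 101,103,104,102 Isr: 101,103,104,102
Topic: foo.test Partition: 7 Leader: 103 Replicas: 103,104,102,101 Isr: 103,104,102
Topic: foo.test Partition: 8 Leader: 101 Replicas: 101,102,104,103 Isr: 101,102,104,103
Topic: foo.test Partition: 9 Leader: 102 Replicas: 102,104,103,101 Isr: 102,104,103
"""

# Matches one partition row; the summary row (PartitionCount, Configs) is not matched.
ROW = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(\d+)\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]+)"
)


def partitions_for_broker(describe_output, broker_id):
    """Return (partitions led by broker_id, partitions with broker_id in the ISR)."""
    leader_of, in_isr_of = [], []
    for match in ROW.finditer(describe_output):
        partition = int(match.group(1))
        leader = int(match.group(2))
        isr = [int(b) for b in match.group(4).split(",")]
        if leader == broker_id:
            leader_of.append(partition)
        if broker_id in isr:
            in_isr_of.append(partition)
    return leader_of, in_isr_of


if __name__ == "__main__":
    # Assumption: broker-1 corresponds to broker id 101.
    leads, in_sync = partitions_for_broker(DESCRIBE_OUTPUT, 101)
    print("leader of:", leads)
    print("in ISR of:", in_sync)
```

For the rows above, both lists contain partitions 0, 6 and 8, matching the observation in the report.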
>
> Consider `foo.test-8`. It is present in the following brokers/disks:
>
>
> {code:java}
> $ fd foo.test-8
> broker-1/d1/foo.test-8/
> broker-2/d2/foo.test-8/
> broker-3/d2/foo.test-8/
> broker-4/d1/foo.test-8/{code}
>
>
> On `broker-1/d1`, the partition still carries the id of the deleted topic
> (`rAujIqcjRbu_-E4UxgQT8Q`), whose deletion is pending because the log dir is
> marked offline.
>
>
> {code:java}
> $ cat broker-1/d1/foo.test-8/partition.metadata
> version: 0
> topic_id: rAujIqcjRbu_-E4UxgQT8Q{code}
>
>
> However, the other brokers have the correct topic id:
>
>
> {code:java}
> $ cat broker-2/d2/foo.test-8/partition.metadata
> version: 0
> topic_id: bgdrsv-1QjCLFEqLOzVCHg{code}
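
The comparison of topic ids can be sketched as below. This is an illustrative helper, not Kafka code: it assumes `partition.metadata` holds simple `key: value` lines, as in the two listings above, and flags replica directories whose `topic_id` differs from the id of the recreated topic. The function names are hypothetical.

```python
def read_topic_id(metadata_text):
    """Return the topic_id value from partition.metadata contents, or None if absent."""
    for line in metadata_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "topic_id":
            return value.strip()
    return None


def stale_replicas(metadata_by_dir, current_topic_id):
    """Return the replica directories whose topic_id is not current_topic_id."""
    return sorted(
        directory
        for directory, text in metadata_by_dir.items()
        if read_topic_id(text) != current_topic_id
    )


if __name__ == "__main__":
    # File contents taken from the two `cat` listings in the report.
    metadata = {
        "broker-1/d1/foo.test-8": "version: 0\ntopic_id: rAujIqcjRbu_-E4UxgQT8Q\n",
        "broker-2/d2/foo.test-8": "version: 0\ntopic_id: bgdrsv-1QjCLFEqLOzVCHg\n",
    }
    print(stale_replicas(metadata, "bgdrsv-1QjCLFEqLOzVCHg"))
```

With the two files from the report, only `broker-1/d1/foo.test-8` is flagged as stale.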
>
>
> Now, let's consider `foo.test-0`. We observe that the replica isn't present
> in `broker-1`:
> {code:java}
> $ fd foo.test-0
> broker-2/d1/foo.test-0/
> broker-3/d1/foo.test-0/
> broker-4/d2/foo.test-0/{code}
> In both cases, `broker-1` shouldn't be the leader or an in-sync replica for
> these partitions.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)