Emanuele Sabellico created KAFKA-17237:
------------------------------------------
Summary: [rack-aware assignors] Rebalance is triggered every time
a broker isn't reported from a metadata call
Key: KAFKA-17237
URL: https://issues.apache.org/jira/browse/KAFKA-17237
Project: Kafka
Issue Type: Bug
Components: clients
Affects Versions: 3.8.0, 3.5.0
Reporter: Emanuele Sabellico
Attachments: test.log
When a client is configured for rack awareness to enable fetch-from-follower
(FFF) and rack-aware assignors, a rebalance is triggered every time a broker
disappears from a Metadata response, such as during a cluster roll.
This happens because, after KIP-881, the metadata appears to have changed
whenever the set of racks differs, and brokers that are down carry no rack
information (their rack shows up as null).
*How to reproduce*
* Enable *client.rack* on the client and *broker.rack* on the cluster
* Create a topic with replicas on all the nodes
* Subscribe to that topic on the client
* Stop one of the brokers
* Observe that a rebalance is triggered
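A possible client configuration for the steps above, as a sketch only. The group id and rack values are taken from the log in this report; the bootstrap address, the choice of RangeAssignor as the rack-aware assignor, and the deserializers are assumptions and not necessarily what was used to produce the attached log.

```java
import java.util.Properties;

public class RackAwareConsumerConfig {

    // Builds consumer properties for a rack-aware client.
    static Properties consumerProps() {
        Properties props = new Properties();
        // Assumption: a local test cluster; replace with the real brokers.
        props.put("bootstrap.servers", "localhost:9092");
        // Group id as it appears in the log.
        props.put("group.id", "test_racks");
        // Rack of this client; should match one of the broker.rack values.
        props.put("client.rack", "1a");
        // Assumption: RangeAssignor as the rack-aware assignor under test.
        props.put("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.RangeAssignor");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // Broker side (server.properties), one rack per broker, e.g.:
        //   broker.rack=1a
        //   replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector
        // The replica selector is only needed for fetch-from-follower.
        consumerProps().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These properties can be passed to a KafkaConsumer that subscribes to the topic (test_new in the log) before one of the brokers is stopped.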
Attached is a log reproducing the issue in the Java client. A few lines showing
the rejoin requests:
{noformat}
[2024-08-01 15:09:07,472] INFO [Consumer clientId=consumer-test_racks-1,
groupId=test_racks] Request joining group due to: cached metadata has changed
from (version4: {test_new=[racks=[null, 1b, 1c]]}) at the beginning of the
rebalance to (version5: {test_new=[racks=[1a, 1b, 1c]]})
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2024-08-01 15:10:38,689] INFO [Consumer clientId=consumer-test_racks-1,
groupId=test_racks] Request joining group due to: cached metadata has changed
from (version6: {test_new=[racks=[1a, 1b, 1c]]}) at the beginning of the
rebalance to (version42: {test_new=[racks=[null, 1a, 1c]]})
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2024-08-01 15:11:04,106] INFO [Consumer clientId=consumer-test_racks-1,
groupId=test_racks] Request joining group due to: cached metadata has changed
from (version43: {test_new=[racks=[1a, 1b, 1c]]}) at the beginning of the
rebalance to (version45: {test_new=[racks=[null, 1a, 1b]]})
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
{noformat}
The same happens in librdkafka, as reported in this issue:
[https://github.com/confluentinc/librdkafka/issues/4742]