Yu Wang created KAFKA-20394:
-------------------------------
Summary: Selector Channel not synced with Connections State and
Metadata
Key: KAFKA-20394
URL: https://issues.apache.org/jira/browse/KAFKA-20394
Project: Kafka
Issue Type: Bug
Components: consumer, network, producer
Reporter: Yu Wang
h3. Problem:
*Initial State:* Broker 8 is running on IP address A
*Metadata State:* ProducerMetadata correctly shows Broker 8 → IP A
*Channel State:* Sender has active channel for Broker 8 connected to IP A
*Infrastructure Change:* IP A is reassigned from Broker 8 to Broker 19 on
another Kafka cluster
h3. Post-Change State:
✅ ProducerMetadata correctly updates: Broker 8 → IP B, Broker 19 → IP A
❌ Sender channels still maintain: Broker 8 → IP A (stale connection)
h3. Observed Symptoms
Heap dump analysis shows correct metadata but incorrect channel mappings
Connection failures when attempting to send to Broker 8
Messages may be incorrectly routed to Broker 19 (now on IP A)
Network timeouts and reconnection attempts
h3. Root cause
Metadata refresh does not check whether a broker's IP address has changed, so
the NetworkClient and Selector keep using the stale IP.
The SocketChannel has keepalive enabled (by default it keeps the TCP connection
for 2 hours 10 minutes), so if the IP is reassigned to another Kafka pod within
a short time, the existing connection ends up pointing at the wrong broker.
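The missing check could look roughly like the sketch below. It is only an illustration of the idea, not Kafka's actual code: the class {{StaleChannelDetector}} and its methods {{recordConnection}}, {{forget}} and {{isStale}} are hypothetical names. It assumes the client remembers the address each node's channel was opened against and compares it with the address reported by the latest metadata.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, NOT part of Kafka: remembers which address each
// node's channel was opened against and flags channels whose address no
// longer matches the latest metadata.
class StaleChannelDetector {
    // nodeId -> address the currently open channel is connected to
    private final Map<Integer, InetSocketAddress> connected = new HashMap<>();

    // Call when a channel to nodeId has been established.
    void recordConnection(int nodeId, InetSocketAddress address) {
        connected.put(nodeId, address);
    }

    // Call when the channel to nodeId has been closed.
    void forget(int nodeId) {
        connected.remove(nodeId);
    }

    // True if a channel exists for nodeId and it was opened against a
    // different address than the one metadata now reports.
    boolean isStale(int nodeId, InetSocketAddress fromMetadata) {
        InetSocketAddress current = connected.get(nodeId);
        return current != null && !current.equals(fromMetadata);
    }
}
```

With the scenario above, a channel recorded for Broker 8 on IP A would be reported as stale once metadata shows Broker 8 on IP B, and the caller could then close it and reconnect instead of waiting for keepalive to expire.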