Yu Wang created KAFKA-20394:
-------------------------------

             Summary: Selector Channel not synced with Connections State and 
Metadata
                 Key: KAFKA-20394
                 URL: https://issues.apache.org/jira/browse/KAFKA-20394
             Project: Kafka
          Issue Type: Bug
          Components: consumer, network, producer 
            Reporter: Yu Wang


h3. Problem:

*Initial State:* Broker 8 is running on IP address A
*Metadata State:* ProducerMetadata correctly shows Broker 8 → IP A
*Channel State:* Sender has active channel for Broker 8 connected to IP A
*Infrastructure Change:* IP A is reassigned from Broker 8 to Broker 19 on 
another Kafka cluster
h3. Post-Change State:

✅ ProducerMetadata correctly updates: Broker 8 → IP B, Broker 19 → IP A
❌ Sender channels still maintain: Broker 8 → IP A (stale connection)
h3. Observed Symptoms

Heap dump analysis shows correct metadata but incorrect channel mappings
Connection failures when attempting to send to Broker 8
Messages may be incorrectly routed to Broker 19 (now on IP A)
Network timeouts and reconnection attempts
h3. Root cause

The metadata refresh does not check whether a broker's IP address has changed, 
so the NetworkClient and Selector keep using the stale IP.
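
To illustrate the missing check, here is a minimal sketch that compares the 
address recorded in metadata for each node with the address its open channel is 
connected to. It is not Kafka's API: the class StaleChannelCheck, the method 
staleNodes, and the two maps are hypothetical stand-ins for ProducerMetadata 
and the Selector's channel table, and the IP addresses are made up.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the Kafka client.
class StaleChannelCheck {

    // Returns the node ids whose open channel points at an address that
    // differs from the one metadata currently reports for that node.
    static List<Integer> staleNodes(Map<Integer, InetSocketAddress> metadataAddrs,
                                    Map<Integer, InetSocketAddress> channelAddrs) {
        List<Integer> stale = new ArrayList<>();
        // TreeMap only gives a deterministic (sorted) result order.
        for (Map.Entry<Integer, InetSocketAddress> e : new TreeMap<>(channelAddrs).entrySet()) {
            InetSocketAddress expected = metadataAddrs.get(e.getKey());
            if (expected != null && !expected.equals(e.getValue())) {
                stale.add(e.getKey());
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        // Made-up addresses: "IP A" = 10.0.0.1, "IP B" = 10.0.0.2.
        Map<Integer, InetSocketAddress> metadata = Map.of(
                8, new InetSocketAddress("10.0.0.2", 9092),
                19, new InetSocketAddress("10.0.0.1", 9092));
        Map<Integer, InetSocketAddress> channels = Map.of(
                8, new InetSocketAddress("10.0.0.1", 9092));
        System.out.println(staleNodes(metadata, channels)); // prints [8]
    }
}
```

In this sketch, any node id returned would be a candidate for closing the 
channel and reconnecting to the address from metadata.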

The SocketChannel has keepalive enabled (by default, TCP keeps an idle 
connection open for about 2 hours 10 minutes), so if the IP is reassigned to 
another Kafka pod within a short time, the existing connection ends up pointing 
at the wrong broker.
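
The figure of roughly 2 hours 10 minutes presumably corresponds to the Linux 
TCP keepalive defaults (tcp_keepalive_time = 7200 s, tcp_keepalive_intvl = 75 s, 
tcp_keepalive_probes = 9). That mapping is an assumption, not something stated 
in this report, and the class KeepaliveWindow below is hypothetical. The sketch 
only works out how long an idle connection to an unresponsive peer would 
survive under those values.

```java
import java.time.Duration;

// Hypothetical helper for the arithmetic only; values are the Linux defaults.
class KeepaliveWindow {

    // Idle time before the first probe, plus one interval per unanswered probe.
    static Duration detectionWindow(int idleSeconds, int intervalSeconds, int probes) {
        return Duration.ofSeconds(idleSeconds + (long) intervalSeconds * probes);
    }

    public static void main(String[] args) {
        // 7200 + 9 * 75 = 7875 seconds
        System.out.println(detectionWindow(7200, 75, 9)); // prints PT2H11M15S
    }
}
```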

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
