[
https://issues.apache.org/jira/browse/KAFKA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876222#comment-16876222
]
Sam Weston commented on KAFKA-7931:
-----------------------------------
Have you made any progress with this? I have the same problem if I lose more
than 1 node every 5 minutes or so, and I haven't worked out how to monitor for
it yet...
> Java Client: if all ephemeral brokers fail, client can never reconnect to
> brokers
> ---------------------------------------------------------------------------------
>
> Key: KAFKA-7931
> URL: https://issues.apache.org/jira/browse/KAFKA-7931
> Project: Kafka
> Issue Type: Bug
> Components: clients
> Affects Versions: 2.1.0
> Reporter: Brian
> Priority: Critical
>
> Steps to reproduce:
> * Set up a Kafka cluster in GKE, with the bootstrap server address configured
> to point to a load balancer that exposes all GKE nodes
> * Run a producer that emits values into a partition with 3 replicas
> * Kill every broker in the cluster
> * Wait for brokers to restart
> Observed result:
> The Java client cannot find any of the nodes even though they have all
> recovered. I see messages like "Connection to node 30 (/10.6.0.101:9092)
> could not be established. Broker may not be available.".
> Note, this is *not* a duplicate of
> https://issues.apache.org/jira/browse/KAFKA-7890. I'm using the client
> version that contains the fix for
> https://issues.apache.org/jira/browse/KAFKA-7890.
> Versions:
> Kafka: version 2.1.0, using the confluentinc/cp-kafka/5.1.0 Docker image
> Client: trunk from a few days ago (git sha
> 9f7e6b291309286e3e3c1610e98d978773c9d504), to pull in the fix for KAFKA-7890
>
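One rough way to watch for the symptom described in the issue is to count the "could not be established" warnings in the client log. The following is only a sketch, assuming the log lines are available as text and that the message format matches the one quoted above. The names count_failed_connections and PATTERN are hypothetical and not part of Kafka, and IPv6 addresses are not handled.

```python
import re
from collections import Counter

# Matches the warning quoted in the issue description, e.g.
#   Connection to node 30 (/10.6.0.101:9092) could not be established.
# The hostname before "/" may be empty, as it is in the quoted message.
# A leading "-" is allowed in the node id (an assumption about the log format).
PATTERN = re.compile(
    r"Connection to node (-?\d+) \(([^/)]*)/([^:)]+):(\d+)\) "
    r"could not be established"
)


def count_failed_connections(lines):
    """Count connection-failure warnings per (node id, address, port)."""
    failures = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            node_id, _host, address, port = match.groups()
            failures[(int(node_id), address, int(port))] += 1
    return failures


if __name__ == "__main__":
    # Hypothetical log excerpt built from the message quoted in the issue.
    sample = [
        "WARN Connection to node 30 (/10.6.0.101:9092) could not be "
        "established. Broker may not be available.",
        "INFO some unrelated line",
    ]
    for key, count in count_failed_connections(sample).items():
        print(key, count)
```

A monitor built on this could alert when the same node address keeps failing for longer than the brokers normally take to restart, which may indicate the client is holding on to stale addresses.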