[
https://issues.apache.org/jira/browse/KAFKA-10253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17161468#comment-17161468
]
Konstantin Lalafaryan commented on KAFKA-10253:
-----------------------------------------------
[~ChrisEgerton] Do you need any more info?
> Kafka Connect gets into an infinite rebalance loop
> --------------------------------------------------
>
> Key: KAFKA-10253
> URL: https://issues.apache.org/jira/browse/KAFKA-10253
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect
> Affects Versions: 2.5.0
> Reporter: Konstantin Lalafaryan
> Priority: Blocker
>
> Hello everyone!
>
> We are running a Kafka Connect cluster (3 workers), and it very often gets
> into an infinite rebalance loop.
>
> {code:java}
> 2020-07-09 08:51:25,731 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Rebalance started
> (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,731 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] (Re-)joining group
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,733 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Was selected to perform assignments, but do not have latest
> config found in sync request. Returning an empty configuration to trigger
> re-sync.
> (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,735 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Successfully joined group with generation 305655831
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,735 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Joined group at generation 305655831 with protocol version 2
> and got assignment: Assignment{error=1,
> leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb',
> leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[],
> taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with
> rebalance delay: 0
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,735 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Rebalance started
> (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,735 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] (Re-)joining group
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,736 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Was selected to perform assignments, but do not have latest
> config found in sync request. Returning an empty configuration to trigger
> re-sync.
> (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,739 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Successfully joined group with generation 305655832
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,739 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Joined group at generation 305655832 with protocol version 2
> and got assignment: Assignment{error=1,
> leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb',
> leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[],
> taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with
> rebalance delay: 0
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,739 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Rebalance started
> (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,739 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] (Re-)joining group
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,740 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Was selected to perform assignments, but do not have latest
> config found in sync request. Returning an empty configuration to trigger
> re-sync.
> (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,742 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Successfully joined group with generation 305655833
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,742 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Joined group at generation 305655833 with protocol version 2
> and got assignment: Assignment{error=1,
> leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb',
> leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[],
> taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with
> rebalance delay: 0
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,742 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Rebalance started
> (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,742 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] (Re-)joining group
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,744 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Was selected to perform assignments, but do not have latest
> config found in sync request. Returning an empty configuration to trigger
> re-sync.
> (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,746 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Successfully joined group with generation 305655834
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,746 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Joined group at generation 305655834 with protocol version 2
> and got assignment: Assignment{error=1,
> leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb',
> leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[],
> taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with
> rebalance delay: 0
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,746 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Rebalance started
> (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,746 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] (Re-)joining group
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,748 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Was selected to perform assignments, but do not have latest
> config found in sync request. Returning an empty configuration to trigger
> re-sync.
> (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,750 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Successfully joined group with generation 305655835
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,750 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Joined group at generation 305655835 with protocol version 2
> and got assignment: Assignment{error=1,
> leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb',
> leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[],
> taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with
> rebalance delay: 0
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,750 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Rebalance started
> (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,750 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] (Re-)joining group
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,751 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Was selected to perform assignments, but do not have latest
> config found in sync request. Returning an empty configuration to trigger
> re-sync.
> (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,754 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Successfully joined group with generation 305655836
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,754 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Joined group at generation 305655836 with protocol version 2
> and got assignment: Assignment{error=1,
> leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb',
> leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[],
> taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with
> rebalance delay: 0
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,754 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Rebalance started
> (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,754 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] (Re-)joining group
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,755 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Was selected to perform assignments, but do not have latest
> config found in sync request. Returning an empty configuration to trigger
> re-sync.
> (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,757 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Successfully joined group with generation 305655837
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,757 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Joined group at generation 305655837 with protocol version 2
> and got assignment: Assignment{error=1,
> leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb',
> leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[],
> taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with
> rebalance delay: 0
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,757 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Rebalance started
> (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,757 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] (Re-)joining group
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,759 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Was selected to perform assignments, but do not have latest
> config found in sync request. Returning an empty configuration to trigger
> re-sync.
> (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,761 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Successfully joined group with generation 305655838
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,761 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Joined group at generation 305655838 with protocol version 2
> and got assignment: Assignment{error=1,
> leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb',
> leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[],
> taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with
> rebalance delay: 0
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,761 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Rebalance started
> (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,761 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] (Re-)joining group
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,763 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Was selected to perform assignments, but do not have latest
> config found in sync request. Returning an empty configuration to trigger
> re-sync.
> (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,766 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Successfully joined group with generation 305655839
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,766 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Joined group at generation 305655839 with protocol version 2
> and got assignment: Assignment{error=1,
> leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb',
> leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[],
> taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with
> rebalance delay: 0
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,766 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Rebalance started
> (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,766 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] (Re-)joining group
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,768 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Was selected to perform assignments, but do not have latest
> config found in sync request. Returning an empty configuration to trigger
> re-sync.
> (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,771 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Successfully joined group with generation 305655840
> (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [DistributedHerder-connect-1-1]
> 2020-07-09 08:51:25,771 INFO [Worker clientId=connect-1, groupId=
> kafka-connect] Joined group at generation
> {code}
> This is happening on all 3 workers.
>
> On the broker side we see the following:
> {code:java}
> 2020-07-09 16:39:46,260 INFO [GroupCoordinator 0]: Preparing to rebalance
> group kafka-connect in state PreparingRebalance with old generation 311127279
> (__consumer_offsets-7) (reason: Updating metadata for member
> connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7)
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-0]
> 2020-07-09 16:39:46,261 INFO [GroupCoordinator 0]: Stabilized group
> kafka-connect generation 311127280 (__consumer_offsets-7)
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-5]
> 2020-07-09 16:39:46,262 INFO [GroupCoordinator 0]: Assignment received from
> leader for group kafka-connect for generation 311127280
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-1]
> 2020-07-09 16:39:46,265 INFO [GroupCoordinator 0]: Preparing to rebalance
> group kafka-connect in state PreparingRebalance with old generation 311127280
> (__consumer_offsets-7) (reason: Updating metadata for member
> connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7)
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-1]
> 2020-07-09 16:39:46,266 INFO [GroupCoordinator 0]: Stabilized group
> kafka-connect generation 311127281 (__consumer_offsets-7)
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-6]
> 2020-07-09 16:39:46,267 INFO [GroupCoordinator 0]: Assignment received from
> leader for group kafka-connect for generation 311127281
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-1]
> 2020-07-09 16:39:46,270 INFO [GroupCoordinator 0]: Preparing to rebalance
> group kafka-connect in state PreparingRebalance with old generation 311127281
> (__consumer_offsets-7) (reason: Updating metadata for member
> connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7)
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-7]
> 2020-07-09 16:39:46,271 INFO [GroupCoordinator 0]: Stabilized group
> kafka-connect generation 311127282 (__consumer_offsets-7)
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-6]
> 2020-07-09 16:39:46,272 INFO [GroupCoordinator 0]: Assignment received from
> leader for group kafka-connect for generation 311127282
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-1]
> 2020-07-09 16:39:46,275 INFO [GroupCoordinator 0]: Preparing to rebalance
> group kafka-connect in state PreparingRebalance with old generation 311127282
> (__consumer_offsets-7) (reason: Updating metadata for member
> connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7)
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-3]
> 2020-07-09 16:39:46,276 INFO [GroupCoordinator 0]: Stabilized group
> kafka-connect generation 311127283 (__consumer_offsets-7)
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-7]
> 2020-07-09 16:39:46,277 INFO [GroupCoordinator 0]: Assignment received from
> leader for group kafka-connect for generation 311127283
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-5]
> 2020-07-09 16:39:46,280 INFO [GroupCoordinator 0]: Preparing to rebalance
> group kafka-connect in state PreparingRebalance with old generation 311127283
> (__consumer_offsets-7) (reason: Updating metadata for member
> connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7)
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-5]
> 2020-07-09 16:39:46,281 INFO [GroupCoordinator 0]: Stabilized group
> kafka-connect generation 311127284 (__consumer_offsets-7)
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-7]
> 2020-07-09 16:39:46,282 INFO [GroupCoordinator 0]: Assignment received from
> leader for group kafka-connect for generation 311127284
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-3]
> 2020-07-09 16:39:46,285 INFO [GroupCoordinator 0]: Preparing to rebalance
> group kafka-connect in state PreparingRebalance with old generation 311127284
> (__consumer_offsets-7) (reason: Updating metadata for member
> connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7)
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-1]
> 2020-07-09 16:39:46,286 INFO [GroupCoordinator 0]: Stabilized group
> kafka-connect generation 311127285 (__consumer_offsets-7)
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-4]
> 2020-07-09 16:39:46,287 INFO [GroupCoordinator 0]: Assignment received from
> leader for group kafka-connect for generation 311127285
> (kafka.coordinator.group.GroupCoordinator)
> [data-plane-kafka-request-handler-7]
> {code}
>
> Any feedback is appreciated!
> Thanks!
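
To put a number on the loop, here is a minimal sketch (not part of the original report) that measures how fast the group generation advances in a worker log. It assumes the log format shown above; the function name `rebalance_rate` and the regular expression are illustrative only. Run against the worker excerpt above, generation 305655831 is joined at 08:51:25,735 and 305655840 at 08:51:25,771, i.e. 9 rebalances in 36 ms.

```python
import re
from datetime import datetime

# Matches the worker's "Successfully joined group" line once wrapping is undone.
# The pattern is an assumption based on the log excerpt in this issue.
JOIN_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO \[Worker [^\]]*\] "
    r"Successfully joined group with generation (\d+)"
)


def rebalance_rate(log_text):
    """Return (generations_advanced, elapsed_seconds, rebalances_per_second)."""
    # Collapse all whitespace so log lines wrapped by email still match.
    flat = re.sub(r"\s+", " ", log_text)
    joins = [
        (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S,%f"), int(gen))
        for ts, gen in JOIN_RE.findall(flat)
    ]
    if len(joins) < 2:
        return 0, 0.0, 0.0
    (t0, g0), (t1, g1) = joins[0], joins[-1]
    elapsed = (t1 - t0).total_seconds()
    advanced = g1 - g0
    rate = advanced / elapsed if elapsed > 0 else float("inf")
    return advanced, elapsed, rate
```

A healthy group should advance its generation only when workers join or leave or when connector configuration changes, so a sustained rate of hundreds per second is a clear sign of the loop described here.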