[ https://issues.apache.org/jira/browse/KAFKA-17445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17878006#comment-17878006 ]
Bruno Cadonna commented on KAFKA-17445:
---------------------------------------
[~rohitbobade] It is not clear to me what you tried to achieve by setting
{{group.instance.id}}. Could you please elaborate?
Did you increase {{session.timeout.ms}} as described in the config definition
(https://kafka.apache.org/documentation/#consumerconfigs_group.instance.id)?
Could you describe the exact steps?
Did you delete the consumer group on the broker between the attempts?
Was this a new Streams app or an existing one?
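
For reference, a minimal sketch of the kind of configuration the questions above are about: static membership ({{group.instance.id}}) combined with a raised {{session.timeout.ms}}. The application id, bootstrap server, instance id, and the 120000 ms timeout are hypothetical placeholders, not values taken from this ticket. The sketch uses plain string keys with java.util.Properties so that it runs without the Kafka jars. In a real app the same Properties object would be passed to the KafkaStreams constructor.

```java
import java.util.Properties;

public class StaticMembershipConfigSketch {

    // Builds a Streams-style configuration with static membership enabled.
    // instanceId must be unique per application instance and stay the same
    // across restarts of that instance.
    static Properties build(String instanceId, int sessionTimeoutMs) {
        Properties props = new Properties();
        props.put("application.id", "example-streams-app");  // hypothetical
        props.put("bootstrap.servers", "localhost:9092");    // hypothetical
        // Static membership: the broker identifies the member by this id.
        props.put("group.instance.id", instanceId);
        // Raised above the 45000 ms default so that a restarting instance can
        // rejoin before the broker expires the member. The right value
        // depends on how long a restart actually takes.
        props.put("session.timeout.ms", Integer.toString(sessionTimeoutMs));
        return props;
    }

    public static void main(String[] args) {
        // 120000 ms is an arbitrary value chosen for illustration only.
        Properties props = build("example-streams-app-host-1", 120_000);
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

If your setup differs from this, for example the same instance id is reused by two instances or the session timeout is left at the default, please mention that when you describe the steps.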
> Kafka streams keeps rebalancing with the following reasons
> ----------------------------------------------------------
>
> Key: KAFKA-17445
> URL: https://issues.apache.org/jira/browse/KAFKA-17445
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 3.8.0
> Reporter: Rohit Bobade
> Priority: Major
>
> We recently upgraded Kafka streams version to 3.8.0 and are seeing that the
> streams app keeps rebalancing and does not process any events
> We have explicitly set the config
> GROUP_INSTANCE_ID_CONFIG
> This is what we see on the broker logs:
> [GroupCoordinator 2]: Preparing to rebalance group \{consumer-group-name} in
> state PreparingRebalance with old generation 24781 (__consumer_offsets-29)
> (reason: Updating metadata for static member {} with instance id {}; client
> reason: rebalance failed due to UnjoinedGroupException)
> We also tried to remove the GROUP_INSTANCE_ID_CONFIG but then see these logs
> and rebalancing and no processing still
> sessionTimeoutMs=45000, rebalanceTimeoutMs=1800000,
> supportedProtocols=List(stream)) has left group \{groupId} through explicit
> `LeaveGroup`; client reason: the consumer unsubscribed from all topics
> (kafka.coordinator.group.GroupCoordinator)
> other logs show:
> during Stable; client reason: need to revoke partitions and re-join)
> client reason: triggered followup rebalance scheduled for 0
> On the application logs we see:
> 1. state being restored from changelog topic
> 2. INFO org.apache.kafka.streams.processor.internals.StreamThread -
> stream-thread at state RUNNING: partitions lost due to missed rebalance.
> Detected that the thread is being fenced. This implies that this thread
> missed a rebalance and dropped out of the consumer group. Will close out all
> assigned tasks and rejoin the consumer group.
>
> 3. Task Migrated exceptions
> org.apache.kafka.streams.errors.TaskMigratedException: Error encountered
> sending record to topic
> org.apache.kafka.common.errors.InvalidProducerEpochException: Producer with
> transactionalId
> attempted to produce with an old epoch
> Written offsets would not be recorded and no more records would be sent since
> the producer is fenced, indicating the task may be migrated out; it means all
> tasks belonging to this thread should be migrated.
> at
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:306)
> ~[kafka-streams-3.8.0.jar:?]
> at
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.lambda$send$1(RecordCollectorImpl.java:286)
> ~[kafka-streams-3.8.0.jar:?]
> at
> datadog.trace.instrumentation.kafka_clients.KafkaProducerCallback.onCompletion(KafkaProducerCallback.java:44)
> ~[?:?]
> at
> org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:1106)
> ~[kafka-clients-3.8.0.jar:?]