[ 
https://issues.apache.org/jira/browse/KAFKA-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varsha Abhinandan updated KAFKA-8673:
-------------------------------------
    Description: 
We observed a deadlock-like situation in our Kafka Streams application 
after we accidentally shut down all the brokers. The Kafka cluster was brought 
back up after about an hour. 

Observations:
 # Plain Kafka producers and consumers resumed working normally once the 
brokers were up again. 
 # The Kafka Streams applications remained stuck in the "rebalancing" state.
 # The Kafka Streams apps have exactly-once semantics enabled.
 # The stack traces showed most of the stream threads sending join group 
requests to the group coordinator.
 # A few stream threads could not initiate the join group request because 
their call to 
[org.apache.kafka.clients.producer.KafkaProducer#sendOffsetsToTransaction|https://jira.corp.appdynamics.com/browse/ANLYTCS_ES-2062#sendOffsetsToTransaction%20which%20was%20hung]
 was stuck.
 # It seems the join group requests were being parked at the coordinator 
because the expected members had not sent their own join group requests.
 # After the timeout, the stream threads that were not stuck sent new 
join group requests.
 # Possibly (6) and (7) repeat indefinitely.
 # Sample values of the GroupMetadata object on the group coordinator: 
!Screen Shot 2019-07-11 at 12.08.09 PM.png|width=319,height=53!
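
The loop suspected in observations (6) to (8) can be illustrated with a small toy model. This is only a sketch of the hypothesis as stated above, not Kafka's actual coordinator logic: the names ToyCoordinator, simulate, and expire_round are hypothetical, and the model simply assumes that a rebalance completes only when every expected member has sent a join request and that parked requests are dropped when the timeout expires.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCoordinator:
    """Hypothetical stand-in for a group coordinator (not Kafka's real code)."""
    expected_members: frozenset
    joined: set = field(default_factory=set)

    def join(self, member):
        # Park the join request until all expected members have joined.
        self.joined.add(member)

    def rebalance_complete(self):
        return self.joined == set(self.expected_members)

    def expire_round(self):
        # Assumption: on timeout the parked requests are dropped and the
        # members that are able to do so must send a new join request.
        self.joined.clear()


def simulate(members, stuck, max_rounds):
    """Return the round in which the rebalance completes, or None if it never does."""
    coordinator = ToyCoordinator(frozenset(members))
    for round_no in range(1, max_rounds + 1):
        for member in members:
            # A stuck thread (e.g. blocked in sendOffsetsToTransaction)
            # never gets as far as sending its join request.
            if member not in stuck:
                coordinator.join(member)
        if coordinator.rebalance_complete():
            return round_no
        coordinator.expire_round()
    return None


healthy = simulate(["t1", "t2", "t3"], stuck=set(), max_rounds=5)
stalled = simulate(["t1", "t2", "t3"], stuck={"t3"}, max_rounds=5)
```

Under these assumptions, the group with no stuck thread completes in the first round, while the group with one stuck thread never completes no matter how many rounds are allowed, which matches the "rebalancing" state we observed.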

> Kafka stream threads stuck while sending offsets to transaction preventing 
> join group from completing
> -----------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-8673
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8673
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer, streams
>    Affects Versions: 2.2.0
>            Reporter: Varsha Abhinandan
>            Priority: Major



