Rohit Bobade created KAFKA-17380:
------------------------------------
Summary: Kafka Streams few partition stuck in processing - fixed
after restart
Key: KAFKA-17380
URL: https://issues.apache.org/jira/browse/KAFKA-17380
Project: Kafka
Issue Type: Bug
Components: streams
Affects Versions: 2.6.2
Reporter: Rohit Bobade
We are using Kafka Streams 2.6.2 and running stateful aggregations with
exactly-once semantics.
The processing logic is:
consume input records -> aggregate intermediate results and buffer them in a
state store backed by a changelog topic -> punctuate every 15 seconds, flushing
the state store and sending the aggregated records downstream -> perform the
final aggregation and send the result to the output topic
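
To make the buffering stage concrete, here is a minimal plain-Java model of "aggregate into a buffer, flush on punctuate". It is only a sketch and not our actual code: the class and method names (BufferingAggregator, process, maybePunctuate) are hypothetical, the in-memory map stands in for the changelog-backed state store, and the caller passes in the clock instead of Kafka Streams scheduling a punctuator.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of the "buffer, then flush on punctuate" stage.
// All names here are hypothetical; the real application uses a Kafka Streams
// state store and a scheduled punctuator rather than these fields.
class BufferingAggregator {
    // Stands in for the changelog-backed state store.
    private final Map<String, Long> buffer = new LinkedHashMap<>();
    private final long flushIntervalMs;
    private long lastFlushMs;

    BufferingAggregator(long flushIntervalMs, long startMs) {
        this.flushIntervalMs = flushIntervalMs;
        this.lastFlushMs = startMs;
    }

    // Intermediate aggregation: sum the values seen for each key.
    void process(String key, long value) {
        buffer.merge(key, value, Long::sum);
    }

    // Returns the records to send downstream, or an empty map if the
    // flush interval has not elapsed since the last flush.
    Map<String, Long> maybePunctuate(long nowMs) {
        if (nowMs - lastFlushMs < flushIntervalMs) {
            return Collections.emptyMap();
        }
        Map<String, Long> flushed = new LinkedHashMap<>(buffer);
        buffer.clear();
        lastFlushMs = nowMs;
        return flushed;
    }
}
```

With a 15000 ms interval, records processed between two punctuations are emitted together at the next one, and the buffer is empty afterwards.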
Since we use spot instances, one of the pods was restarted and a rebalance was
triggered.
We then noticed ProducerFencedException errors:
{quote}org.apache.kafka.common.errors.ProducerFencedException: Producer
attempted an operation with an old epoch. Either there is a newer producer with
the same transactionalId, or the producer's transaction has been expired by the
broker.
{quote}
After this, a few partitions were stuck and no records were processed until we
restarted the application.
We had configured:
transaction.timeout.ms to 30 seconds
session.timeout.ms to 30 seconds
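
For reference, a rough sketch of how settings like these are typically passed to a Kafka Streams application. The "producer." and "consumer." key prefixes and the "exactly_once" value are assumptions about how the configuration is wired, not a copy of our actual config; the class name is hypothetical, and required settings such as application.id and bootstrap.servers are omitted.

```java
import java.util.Properties;

// Hypothetical helper showing the settings described above as Streams properties.
class StreamsSettings {
    static Properties build() {
        Properties props = new Properties();
        // Exactly-once processing (assumed value for a 2.6.x application).
        props.setProperty("processing.guarantee", "exactly_once");
        // Assumed: producer-level setting passed with the "producer." prefix.
        props.setProperty("producer.transaction.timeout.ms", "30000");
        // Assumed: consumer-level setting passed with the "consumer." prefix.
        props.setProperty("consumer.session.timeout.ms", "30000");
        return props;
    }
}
```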
Could you please advise whether there is a known fix for this edge case?