[ https://issues.apache.org/jira/browse/KAFKA-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias J. Sax updated KAFKA-12462:
------------------------------------
Affects Version/s: 2.7.0
> Threads in PENDING_SHUTDOWN entering a rebalance can cause an illegal state
> exception
> --------------------------------------------------------------------------------------
>
> Key: KAFKA-12462
> URL: https://issues.apache.org/jira/browse/KAFKA-12462
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 2.7.0, 2.8.0
> Reporter: Walker Carlson
> Assignee: A. Sophie Blee-Goldman
> Priority: Blocker
> Labels: streams
> Fix For: 2.8.0, 2.7.1
>
>
> A thread was removed, which put it in the PENDING_SHUTDOWN state, but it
> went through a rebalance before the shutdown completed.
> {code:java}
> // [2021-03-07 04:33:39,385] DEBUG [i-07430efc31ad166b7-StreamThread-6] stream-thread [i-07430efc31ad166b7-StreamThread-6] Ignoring request to transit from PENDING_SHUTDOWN to PARTITIONS_REVOKED: only DEAD state is a valid next state (org.apache.kafka.streams.processor.internals.StreamThread)
> {code}
> Inside StreamsRebalanceListener#onPartitionsRevoked, we have:
> {code:java}
> // handleRevocation is skipped when setState rejects the transition (returns null)
> if (streamThread.setState(State.PARTITIONS_REVOKED) != null && !partitions.isEmpty()) {
>     taskManager.handleRevocation(partitions);
> }
> {code}
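
To make the effect of that guard concrete, here is a minimal sketch of how a rejected transition silently skips the revocation step. It is a simplified, hypothetical model and not the actual Kafka Streams source: the class name ThreadStateSketch, the reduced State enum, and the revocationHandled flag (standing in for the call to taskManager.handleRevocation) are all assumptions made for illustration.

```java
import java.util.Set;

// Simplified, hypothetical model of the thread state check; not the real StreamThread.
class ThreadStateSketch {
    enum State { RUNNING, PARTITIONS_REVOKED, PENDING_SHUTDOWN, DEAD }

    private State state = State.RUNNING;

    // Stands in for "taskManager.handleRevocation(partitions) was called".
    boolean revocationHandled = false;

    // Returns the previous state on success, or null when the request is ignored.
    State setState(State next) {
        if (state == State.PENDING_SHUTDOWN && next != State.DEAD) {
            return null; // only DEAD is a valid next state
        }
        State old = state;
        state = next;
        return old;
    }

    // Mirrors the shape of the guard quoted above.
    void onPartitionsRevoked(Set<String> partitions) {
        if (setState(State.PARTITIONS_REVOKED) != null && !partitions.isEmpty()) {
            revocationHandled = true;
        }
    }

    public static void main(String[] args) {
        ThreadStateSketch thread = new ThreadStateSketch();
        thread.setState(State.PENDING_SHUTDOWN);       // thread was removed
        thread.onPartitionsRevoked(Set.of("topic-0")); // rebalance arrives before shutdown completes
        System.out.println("revocation handled: " + thread.revocationHandled);
    }
}
```

In this model, a thread that is still RUNNING when partitions are revoked sets revocationHandled to true, while a thread already in PENDING_SHUTDOWN leaves it false, which corresponds to the skipped handleRevocation described in this report.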
> Since PENDING_SHUTDOWN → PARTITIONS_REVOKED is a disallowed transition,
> setState returns null and we never invoke TaskManager#handleRevocation.
> Currently, handleRevocation is responsible for preparing any active tasks
> for close: committing offsets, writing the checkpoint, and suspending the
> task. We cannot close the task in handleRevocation itself because we still
> support EAGER rebalancing, which invokes handleRevocation on all tasks at
> the beginning of a rebalance. The tasks that are actually revoked are
> closed later, during TaskManager#handleAssignment. The
> IllegalStateException is thrown because, with handleRevocation skipped,
> the task was never suspended before we attempt to close it, and the
> direct transition from RUNNING → CLOSED is forbidden.
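
The task-side failure can be sketched the same way. The following is a simplified, hypothetical model of a task lifecycle in which close is only legal after suspend; the class name TaskStateSketch, its three-state enum, and the exception message are assumptions for illustration and do not reproduce the real Kafka Streams task classes.

```java
// Simplified, hypothetical task lifecycle; the real Kafka Streams tasks have more states.
class TaskStateSketch {
    enum State { RUNNING, SUSPENDED, CLOSED }

    private State state = State.RUNNING;

    State state() {
        return state;
    }

    // Part of what handleRevocation would do for an active task in this model.
    void suspend() {
        if (state == State.RUNNING) {
            state = State.SUSPENDED;
        }
    }

    // What handleAssignment would do for a revoked task in this model.
    void close() {
        if (state != State.SUSPENDED) {
            throw new IllegalStateException("Illegal transition from " + state + " to CLOSED");
        }
        state = State.CLOSED;
    }

    public static void main(String[] args) {
        TaskStateSketch normal = new TaskStateSketch();
        normal.suspend(); // revocation ran first
        normal.close();   // SUSPENDED -> CLOSED is allowed
        System.out.println("normal path ends in " + normal.state());

        TaskStateSketch skipped = new TaskStateSketch();
        try {
            skipped.close(); // revocation was skipped, task is still RUNNING
        } catch (IllegalStateException e) {
            System.out.println("skipped path fails: " + e.getMessage());
        }
    }
}
```

Under these assumptions, the failure only appears when the suspend step is skipped, so any fix has to ensure that tasks are suspended before close even when the owning thread is in PENDING_SHUTDOWN.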
--
This message was sent by Atlassian Jira
(v8.3.4#803005)