[ https://issues.apache.org/jira/browse/KAFKA-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bruno Cadonna reassigned KAFKA-10015:
-------------------------------------
Assignee: Bruno Cadonna
> React Smartly to Unexpected Errors on Stream Threads
> ----------------------------------------------------
>
> Key: KAFKA-10015
> URL: https://issues.apache.org/jira/browse/KAFKA-10015
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Bruno Cadonna
> Assignee: Bruno Cadonna
> Priority: Major
> Labels: needs-kip
>
> Currently, if an unexpected error occurs on a stream thread, the stream
> thread dies, a rebalance is triggered, and the Streams client continues to
> run with fewer stream threads.
>
> Some errors trigger a cascade of stream thread deaths, i.e., after the
> rebalance that resulted from the death of the first thread, the next thread
> dies, another rebalance is triggered, the next thread dies, and so forth until
> all stream threads are dead and the instance shuts down. Such a chain of
> rebalances could be avoided if an error could be recognized as the cause of
> cascading stream thread deaths, so that the Streams client could be
> shut down after the first stream thread death.
>
> On the other hand, some unexpected errors are transient, and the stream thread
> could safely be restarted without causing further errors and without the need
> to restart the Streams client.
>
> The goal of this ticket is to classify errors and to react to them
> automatically in a way that avoids cascading deaths and recovers stream
> threads where possible.
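
To make the proposed classification concrete, here is a minimal sketch of
what it might look like. It is only an illustration, not a proposed design:
the names Reaction and StreamThreadErrorClassifier are hypothetical and do not
exist in Kafka Streams, and the rule that timeouts and I/O errors count as
transient is an assumption made for the example. The actual classification
would have to be settled in the KIP.

```java
import java.io.IOException;
import java.util.concurrent.TimeoutException;

// Hypothetical reactions to an unexpected error on a stream thread.
enum Reaction {
    RESTART_THREAD,  // error looks transient: replace the dead stream thread
    SHUTDOWN_CLIENT  // error would likely recur on the next thread: stop the client now
}

// Hypothetical classifier; not part of the Kafka Streams API.
final class StreamThreadErrorClassifier {
    private StreamThreadErrorClassifier() {}

    // Walks the cause chain of the error. Assumption for illustration only:
    // a TimeoutException or IOException anywhere in the chain marks the
    // error as transient; everything else is treated as fatal.
    static Reaction classify(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof TimeoutException || t instanceof IOException) {
                return Reaction.RESTART_THREAD;
            }
        }
        return Reaction.SHUTDOWN_CLIENT;
    }
}
```

With a classifier of this shape, the code that observes a stream thread death
could call classify once and either start a replacement thread or shut the
client down immediately, instead of letting every remaining thread hit the
same error through successive rebalances.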
--
This message was sent by Atlassian Jira
(v8.3.4#803005)