[ https://issues.apache.org/jira/browse/KAFKA-13555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17950376#comment-17950376 ]
Matthias J. Sax commented on KAFKA-13555:
-----------------------------------------
Good questions... It would be good to also hear from [~lucasbru] (cc [~cadonna]
[~ableegoldman])
With KIP-1071, I am wondering if we would want to make such improvements on the
broker now?
In general, it's not required that both built-in assignors do the same thing,
as they have different goals to begin with; of course, it's better if they
consider the same things, but we can also do this step-by-step.
{quote}There are also aspects of this which I am not entirely sure of (for
example determining in the code what an input partition is, as opposed to other
types of partitions) and have made a best guess.
{quote}
Not sure what exactly you mean by this. What other types of partitions are you
referring to?
{quote}I'm happy to give a high level overview of all the changes I've made if
that would be helpful, either here or on Github.
{quote}
This sounds easiest to do on a PR; you can always open it as a draft.
> Consider number of input topic partitions for task assignment
> -------------------------------------------------------------
>
> Key: KAFKA-13555
> URL: https://issues.apache.org/jira/browse/KAFKA-13555
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Matthias J. Sax
> Assignee: Lorcan
> Priority: Major
>
> StreamsAssignor tries to distribute tasks evenly across all instances/threads
> of a Kafka Streams application. It knows about instances/threads (to give more
> capacity to instances with more threads), and it distinguishes between
> stateless and stateful tasks. We also try to not move state around but to use
> a sticky assignment if possible. However, the assignment does not take the
> number of input topic partitions into account.
> For example, an upstream task could compute two joins, and thus have 3 input
> partitions, while a downstream task computes a follow-up aggregation with a
> single input partition (from the repartition topic). It could happen that
> one thread gets the 3-input-partition tasks assigned, while the other thread
> gets the single-input-partition tasks assigned, resulting in an uneven
> partition assignment across both threads.
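
To make the imbalance described above concrete, here is a minimal, self-contained
sketch. It is not the StreamsAssignor code: the Task class, the method names, and
the task list are hypothetical and exist only for illustration. It also ignores
stickiness, per-instance thread capacity, and the stateful/stateless distinction.
It compares a count-only round-robin with a greedy placement that weights each
task by its number of input partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PartitionAwareAssignmentSketch {

    /** Hypothetical task model: a task id plus the number of input partitions it reads. */
    static final class Task {
        final String id;
        final int inputPartitions;

        Task(String id, int inputPartitions) {
            this.id = id;
            this.inputPartitions = inputPartitions;
        }
    }

    /** Two upstream join tasks (3 input partitions each), two downstream tasks (1 each). */
    static List<Task> exampleTasks() {
        return Arrays.asList(
            new Task("0_0", 3), new Task("1_0", 1),
            new Task("0_1", 3), new Task("1_1", 1));
    }

    /** Count-only baseline: round-robin over tasks; returns input partitions per thread. */
    static int[] roundRobinLoads(List<Task> tasks, int numThreads) {
        int[] loads = new int[numThreads];
        for (int i = 0; i < tasks.size(); i++) {
            loads[i % numThreads] += tasks.get(i).inputPartitions;
        }
        return loads;
    }

    /** Greedy alternative: heaviest task first, each placed on the least-loaded thread. */
    static int[] partitionAwareLoads(List<Task> tasks, int numThreads) {
        List<Task> sorted = new ArrayList<>(tasks);
        sorted.sort((a, b) -> Integer.compare(b.inputPartitions, a.inputPartitions));
        int[] loads = new int[numThreads];
        for (Task task : sorted) {
            int target = 0;
            for (int t = 1; t < numThreads; t++) {
                if (loads[t] < loads[target]) {
                    target = t;
                }
            }
            loads[target] += task.inputPartitions;
        }
        return loads;
    }

    public static void main(String[] args) {
        List<Task> tasks = exampleTasks();
        // Both threads get two tasks, but one thread reads 6 partitions and the other 2.
        System.out.println("count-only:      " + Arrays.toString(roundRobinLoads(tasks, 2)));
        // Weighting by input partitions gives each thread 4 partitions.
        System.out.println("partition-aware: " + Arrays.toString(partitionAwareLoads(tasks, 2)));
    }
}
```

With this hypothetical task order, both strategies give each thread two tasks, but
only the weighted one evens out the number of input partitions per thread. A real
change would have to combine such a weight with the existing stickiness and
capacity logic.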
--
This message was sent by Atlassian Jira
(v8.20.10#820010)