[
https://issues.apache.org/jira/browse/KAFKA-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sophie Blee-Goldman updated KAFKA-8767:
---------------------------------------
Parent: (was: KAFKA-8179)
Issue Type: Improvement (was: Sub-task)
> Optimize StickyAssignor for Cooperative mode
> --------------------------------------------
>
> Key: KAFKA-8767
> URL: https://issues.apache.org/jira/browse/KAFKA-8767
> Project: Kafka
> Issue Type: Improvement
> Components: clients, consumer
> Affects Versions: 2.4.0
> Reporter: Sophie Blee-Goldman
> Priority: Major
>
> In some rare cases, the StickyAssignor will fail to balance an assignment
> without violating stickiness despite a balanced and sticky assignment being
> possible. The implications of this for cooperative rebalancing are that an
> unnecessary additional rebalance will be triggered.
> This was seen to happen, for example, when each consumer is subscribed to some
> random subset of all topics and all their subscriptions change to a different
> random subset, as occurs in
> AbstractStickyAssignorTest#testReassignmentWithRandomSubscriptionsAndChanges.
> The initial assignment after the random subscription change obviously
> involved migrating partitions, so following the cooperative protocol those
> partitions are removed from the balanced first assignment, and a second
> rebalance is triggered. In some cases, during the second rebalance the
> assignor was unable to reach a balanced assignment without migrating a few
> partitions, even though one must have been possible (since the first
> assignment was balanced). A third rebalance was needed to reach a stable
> balanced state.
> Under the conditions in the previously mentioned test (20-40 consumers and
> 10-20 topics with 0-20 partitions), this third rebalance was
> required roughly 30% of the time. Some initial improvements to the sticky
> assignment logic reduced this to under 15%, but we should consider closing
> this gap and optimizing the cooperative sticky assignment.
>
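The cooperative step described above, where migrating partitions are removed from the balanced first assignment and a follow-up rebalance is triggered, can be illustrated with a minimal sketch. This is not the Kafka implementation: the class `CooperativeAdjustSketch`, the methods `adjust` and `needsFollowUp`, and the string-keyed maps are all hypothetical names chosen for illustration, and partitions are modeled as plain strings such as "t0-2".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from the Kafka code base.
public class CooperativeAdjustSketch {

    /**
     * Returns the assignment that can be handed out in this rebalance:
     * the target assignment minus every partition that is still owned
     * by a different consumer (it must be revoked first).
     */
    static Map<String, List<String>> adjust(Map<String, List<String>> owned,
                                            Map<String, List<String>> target) {
        // Invert the ownership map: partition -> current owner.
        Map<String, String> currentOwner = new HashMap<>();
        for (Map.Entry<String, List<String>> e : owned.entrySet()) {
            for (String partition : e.getValue()) {
                currentOwner.put(partition, e.getKey());
            }
        }

        Map<String, List<String>> adjusted = new HashMap<>();
        for (Map.Entry<String, List<String>> e : target.entrySet()) {
            List<String> kept = new ArrayList<>();
            for (String partition : e.getValue()) {
                String owner = currentOwner.get(partition);
                // Keep partitions that are unowned or already owned by this consumer.
                if (owner == null || owner.equals(e.getKey())) {
                    kept.add(partition);
                }
            }
            adjusted.put(e.getKey(), kept);
        }
        return adjusted;
    }

    /** A follow-up rebalance is needed if any partition had to be withheld. */
    static boolean needsFollowUp(Map<String, List<String>> adjusted,
                                 Map<String, List<String>> target) {
        return !adjusted.equals(target);
    }

    public static void main(String[] args) {
        Map<String, List<String>> owned = Map.of(
                "c1", List.of("t0-0", "t0-1", "t0-2"),
                "c2", List.of("t0-3"));
        // Balanced target: t0-2 should move from c1 to c2.
        Map<String, List<String>> target = Map.of(
                "c1", List.of("t0-0", "t0-1"),
                "c2", List.of("t0-3", "t0-2"));

        Map<String, List<String>> adjusted = adjust(owned, target);
        System.out.println("c2 receives now: " + adjusted.get("c2"));
        System.out.println("follow-up rebalance needed: " + needsFollowUp(adjusted, target));
    }
}
```

In this example t0-2 is withheld from c2 because c1 still owns it, so a second rebalance is required to complete the move. The issue arises when the assignor, run again in that second rebalance, fails to reproduce the balanced target without migrating further partitions, which forces a third round.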
--
This message was sent by Atlassian Jira
(v8.3.4#803005)