[
https://issues.apache.org/jira/browse/KAFKA-19507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jialun Peng updated KAFKA-19507:
--------------------------------
External issue URL:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1194%3A+Optimize+Replica+Assignment+for+Broker+Load+Balance+in+Uneven+Rack+Configurations
Labels: KIP-1194 (was: )
> Optimize Replica Assignment for Broker Load Balance in Uneven Rack
> Configurations
> ---------------------------------------------------------------------------------
>
> Key: KAFKA-19507
> URL: https://issues.apache.org/jira/browse/KAFKA-19507
> Project: Kafka
> Issue Type: Improvement
> Reporter: Jialun Peng
> Assignee: Jialun Peng
> Priority: Major
> Labels: KIP-1194
>
> h3. Issue Description
> Kafka's current replica assignment strategy prioritizes _balancing replica
> counts across racks_ (availability zones in cloud environments) over
> _balancing replicas across individual brokers_. While this ensures rack
> diversity, it creates significant broker-level load imbalance when racks
> contain unequal numbers of brokers.
> h3. Problem Illustration
> Consider a 3-replica topic with 3 racks:
> * *Rack A*: Brokers 1, 4
> * *Rack B*: Brokers 2, 5
> * *Rack C*: Broker 3 (single broker)
> Under the current strategy:
> * Brokers 1, 2, 4, 5 each receive 1/6 of all replicas
> * Broker 3 receives 1/3 of all replicas (twice the load of others)
> This makes Broker 3 the bottleneck (the "bucket effect": cluster capacity is
> limited by its most heavily loaded broker), because it handles twice the
> traffic and storage load of any other broker.
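The shares above can be reproduced with a small sketch. It assumes a simplified model rather than Kafka's actual assignment code: the replication factor equals the rack count, so every partition places one replica in each rack, and each rack's replicas are spread evenly over its brokers. The function name `rack_even_shares` is hypothetical.

```python
from fractions import Fraction


def rack_even_shares(racks):
    """Per-broker share of all replicas when every rack gets an equal share.

    Simplified model (an assumption, not Kafka's real algorithm): each rack
    receives 1/len(racks) of the replicas, split evenly among its brokers.
    """
    rack_share = Fraction(1, len(racks))
    shares = {}
    for brokers in racks.values():
        for broker in brokers:
            shares[broker] = rack_share / len(brokers)
    return shares


# Rack layout from the problem illustration.
racks = {"A": [1, 4], "B": [2, 5], "C": [3]}
shares = rack_even_shares(racks)
for broker in sorted(shares):
    print(f"broker {broker}: {shares[broker]}")
```

Under these assumptions, Broker 3 ends up with 1/3 of the replicas and every other broker with 1/6.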
>
> To mitigate this, deployments today must maintain broker counts as _multiples
> of rack counts_ (e.g., 3, 6, 9 brokers for 3 racks). While this ensures
> balance, it:
> # *Restricts deployment flexibility*: Scaling clusters horizontally requires
> adding/removing brokers in multiples of the rack count.
> # *Increases costs unnecessarily*: For example, a 4-broker cluster could
> suffice for a 3-rack setup, but users must deploy 6 brokers to maintain
> balance—increasing infrastructure costs by 50%.
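The 50% figure follows from rounding the broker count up to the next multiple of the rack count. A minimal sketch of that arithmetic, with hypothetical helper names:

```python
def brokers_for_balance(min_brokers, rack_count):
    """Smallest multiple of rack_count that is >= min_brokers."""
    return ((min_brokers + rack_count - 1) // rack_count) * rack_count


def overhead_percent(min_brokers, rack_count):
    """Extra brokers deployed, as a percentage of the brokers actually needed."""
    deployed = brokers_for_balance(min_brokers, rack_count)
    return 100 * (deployed - min_brokers) / min_brokers


# The example from the text: 4 brokers would suffice, spread over 3 racks.
deployed = brokers_for_balance(4, 3)
overhead = overhead_percent(4, 3)
print(deployed, overhead)
```

For 4 required brokers and 3 racks this yields 6 deployed brokers, i.e. 2 extra brokers on top of the 4 that are needed.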
> h3. Proposed Solution
> Modify the assignment strategy to:
> # *Prioritize broker-level balance* as the primary objective.
> # *Weight rack-level distribution* by broker count per rack (e.g., a rack
> with 2 brokers receives twice the replicas of a rack with 1 broker).
> h4. Benefits
> * *Balanced load*: All brokers receive near-equal replicas regardless of
> rack imbalance.
> * *Deployment flexibility*: Clusters can scale to _any size_ as long as
> {{rack_count ≥ replication_factor}}.
> * *Cost efficiency*: Users deploy only necessary brokers.
> h4. Example Scenario
> _3 replicas, 4 racks with 5 brokers:_
> * *Rack A*: Brokers 1, 5 → Receives 2/5 of replicas (distributed evenly
> between Brokers 1 & 5)
> * *Racks B, C, D*: 1 broker each → Each receives 1/5 of replicas
> _Result_: Every broker handles exactly 1/5 of total replicas, eliminating
> bottlenecks.
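The target shares in this scenario can be computed by weighting each rack by its broker count. The sketch below only derives those target fractions; it does not place individual replicas or check per-partition rack constraints. The function name `weighted_shares` and the broker IDs 2, 3, 4 for Racks B, C, D are assumptions made for illustration.

```python
from fractions import Fraction


def weighted_shares(racks):
    """Target replica shares when racks are weighted by broker count.

    Returns (rack_shares, broker_shares), each as a fraction of all replicas.
    """
    total_brokers = sum(len(brokers) for brokers in racks.values())
    rack_shares = {
        rack: Fraction(len(brokers), total_brokers)
        for rack, brokers in racks.items()
    }
    broker_shares = {
        broker: rack_shares[rack] / len(brokers)
        for rack, brokers in racks.items()
        for broker in brokers
    }
    return rack_shares, broker_shares


# Rack A holds Brokers 1 and 5; IDs for Racks B, C, D are assumed.
racks = {"A": [1, 5], "B": [2], "C": [3], "D": [4]}
rack_shares, broker_shares = weighted_shares(racks)
print(rack_shares["A"], sorted(set(broker_shares.values())))
```

Here Rack A's target is 2/5 of the replicas, each single-broker rack's target is 1/5, and every broker's target is 1/5.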
> h3. Request
> We propose modifying the replica assignment algorithm to prioritize
> broker-level replica balance, while weighting rack-level distribution by the
> number of brokers in each rack. This allows enterprises to deploy Kafka
> clusters with more flexible node counts, improving cost efficiency while
> maintaining rack awareness.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)