[jira] [Commented] (KAFKA-19415) Improve fairness of partition fetch order when clients fetch data to avoid partition starvation

corkitse (Jira) Mon, 28 Jul 2025 02:16:04 -0700


    [ 
https://issues.apache.org/jira/browse/KAFKA-19415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18010309#comment-18010309
 ]


corkitse commented on KAFKA-19415:
----------------------------------

To Mahesh kumar gaddam: thank you for your attention. I've already submitted a 
pull request with the suggested code changes. You can follow that pull request 
for updates on its status.

> Improve fairness of partition fetch order when clients fetch data to avoid 
> partition starvation
> -----------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-19415
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19415
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: corkitse
>            Priority: Minor
>
> Currently, in ReplicaManager.readFromLog, the fetch order of partitions is 
> fixed for each request. When the first few partitions have a large backlog, 
> they may consume all the allowed bytes (maxBytes), causing partitions at the 
> end of the list to be starved and not return data in this fetch cycle. This 
> can lead to persistent high latency for some partitions, especially under 
> heavy load and when the partition order is stable.  
> This behavior breaks the relative independence of partitions and may cause 
> resource utilization to be suboptimal.
>  
> *Reference code*
> {code:scala}
> readPartitionInfo.foreach { case (tp, fetchInfo) =>
>   val readResult = read(tp, fetchInfo, limitBytes, minOneMessage)
>   val recordBatchSize = readResult.info.records.sizeInBytes
>   // Once we read from a non-empty partition, we stop ignoring request and
>   // partition level size limits
>   if (recordBatchSize > 0)
>     minOneMessage = false
>   limitBytes = math.max(0, limitBytes - recordBatchSize)
>   result += (tp -> readResult)
> } {code}
> *Proposed Change*
> Shuffle the order of readPartitionInfo before iteration to avoid always 
> reading partitions in a fixed order.
>  
> *Motivation*
> 1. Avoids starvation of later partitions when earlier partitions have a 
> message backlog, ensuring all partitions have a chance to be served.
> 2. Improves fairness and resource utilization in brokers serving multiple 
> partitions.
> 3. Has negligible impact on Kafka performance, as shuffling a small list of 
> partitions is very lightweight.
>  
> See https://github.com/apache/kafka/pull/19990
>  
>  
>  
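
For readers following the thread, the starvation effect and the proposed
shuffle can be illustrated with a small standalone model. This is only a
sketch, not the code in the pull request: Backlog, readWithBudget and
readShuffled are hypothetical names, and the model ignores minOneMessage and
per-partition size limits that the quoted loop also handles.

```scala
import scala.util.Random

// Hypothetical stand-in for a partition with pending data; not a Kafka type.
final case class Backlog(partition: String, pendingBytes: Int)

object FetchOrderSketch {
  // Simplified model of the budget loop in the quoted snippet: each partition
  // takes what it can from the remaining byte budget, in iteration order.
  def readWithBudget(order: Seq[Backlog], maxBytes: Int): Map[String, Int] = {
    var limitBytes = maxBytes
    order.map { b =>
      val served = math.min(b.pendingBytes, limitBytes)
      limitBytes = math.max(0, limitBytes - served)
      b.partition -> served
    }.toMap
  }

  // The proposed change in miniature: randomize the order before iterating,
  // so no partition is permanently last in line for the budget.
  def readShuffled(parts: Seq[Backlog], maxBytes: Int, rng: Random): Map[String, Int] =
    readWithBudget(rng.shuffle(parts), maxBytes)
}
```

In this model, with three partitions that each hold 1000 pending bytes and a
1000-byte budget, readWithBudget always serves only the first partition in the
list, while readShuffled serves a different partition from one call to the
next. Whether a plain shuffle or some other rotation scheme fits the broker
best is a design question for the pull request itself.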



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
