[ https://issues.apache.org/jira/browse/KAFKA-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868018#comment-17868018 ]
Ramiz Mehran commented on KAFKA-16582:
--------------------------------------
Yes [~xiaodoujiang].
The reason is essentially large payloads. We want a limit that applies before
compression, so that we can stop such large batches from being formed in the
first place.
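For context, a minimal sketch of the kind of application-side guard this would
replace (the 5 MB threshold, the class and method names, and the byte-array
value type are illustrative assumptions, not anything the client provides
today):
{code:java}
import java.nio.charset.StandardCharsets;

import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class RecordSizeGuard {

    // Illustrative pre-compression limit that a producer-side max.record.size would enforce for us.
    private static final int MAX_UNCOMPRESSED_BYTES = 5 * 1024 * 1024;

    /** Sends the record only if its uncompressed payload is under the application-side limit. */
    static void sendIfWithinLimit(Producer<String, byte[]> producer, String topic,
                                  String key, byte[] value) {
        int approxSerializedSize = key.getBytes(StandardCharsets.UTF_8).length + value.length;
        if (approxSerializedSize > MAX_UNCOMPRESSED_BYTES) {
            throw new IllegalArgumentException(
                    "Record of " + approxSerializedSize + " bytes exceeds the pre-compression limit");
        }
        producer.send(new ProducerRecord<>(topic, key, value));
    }
}
{code}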
> Feature Request: Introduce max.record.size Configuration Parameter for
> Producers
> --------------------------------------------------------------------------------
>
> Key: KAFKA-16582
> URL: https://issues.apache.org/jira/browse/KAFKA-16582
> Project: Kafka
> Issue Type: New Feature
> Components: producer
> Affects Versions: 3.6.2
> Reporter: Ramiz Mehran
> Priority: Major
>
> {*}Summary{*}:
> Currently, Kafka producers have a {{max.request.size}} configuration that
> limits the size of the request sent to Kafka brokers, a check applied to both
> compressed and uncompressed data. It also acts as the maximum size of an
> individual record before compression. This dual role can lead to
> inefficiencies and unexpected behaviour, particularly when records are very
> large before compression but, once compressed, fit into {{max.request.size}}
> several times over.
> {*}Problem{*}:
> During spikes in data transmission, especially with large records, batches
> built from compressed records can grow very large while still staying within
> {{max.request.size}}, causing increased latency and a potential processing
> backlog. The problem is particularly pronounced with highly efficient
> compression algorithms such as zstd, where the compressed size allows for
> large batches that are inefficient to process.
> {*}Proposed Solution{*}:
> Introduce a new producer configuration parameter: {{max.record.size}}. This
> parameter would allow administrators to define the maximum size of an
> individual record before compression, making system behaviour more
> predictable by separating the uncompressed record size limit from the
> compressed request size limit.
> {*}Benefits{*}:
> # {*}Predictability{*}: Producers can reject records that exceed the
> {{max.record.size}} before spending resources on compression.
> # {*}Efficiency{*}: Helps in maintaining efficient batch sizes and system
> throughput, especially under high load conditions.
> # {*}System Stability{*}: Avoids very large batches, which can negatively
> affect latency and throughput.
> {*}Example{*}: Consider a scenario where the producer sends records of up to
> 20 MB that, once compressed, fit into a batch under the 25 MB
> {{max.request.size}} multiple times. Such batches can be problematic to
> process efficiently even though they satisfy the current maximum request size
> constraint. With {{max.record.size}} in place, {{max.request.size}} could be
> left to limit only the compressed request size, which we could then lower to,
> say, 5 MB, preventing very large requests and the latency spikes they cause.
> A rough sketch of this configuration follows below.
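> How the two limits from this example might be configured once the proposal
> exists ({{max.record.size}} and the changed semantics for
> {{max.request.size}} are the proposal itself, not behaviour available in
> 3.6.2; the broker address and the 5 MB / 20 MB values are illustrative):
> {code:java}
> import java.util.Properties;
>
> import org.apache.kafka.clients.producer.KafkaProducer;
> import org.apache.kafka.clients.producer.ProducerConfig;
> import org.apache.kafka.common.serialization.ByteArraySerializer;
> import org.apache.kafka.common.serialization.StringSerializer;
>
> public class ProposedLimitsExample {
>     public static void main(String[] args) {
>         Properties props = new Properties();
>         props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
>         props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
>         props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
>         props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "zstd");
>         // Under the proposal this would cap only the compressed request size (5 MB here).
>         props.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, 5 * 1024 * 1024);
>         // Proposed parameter (not implemented yet): cap on the uncompressed record size (20 MB here).
>         props.put("max.record.size", 20 * 1024 * 1024);
>         try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props)) {
>             // produce as usual; records over 20 MB would be rejected before compression
>         }
>     }
> }
> {code}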
> {*}Steps to Reproduce{*}:
> # Configure a Kafka producer with {{max.request.size}} set to 25 MB.
> # Send multiple uncompressed records close to 20 MB that compress to less
> than 25 MB.
> # Observe the impact on Kafka broker performance and client-side latency (a
> repro sketch follows).
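> A minimal repro sketch under the above assumptions (local broker, topic name
> {{large-records}}, and the highly compressible payload are illustrative;
> {{linger.ms}} is raised slightly to encourage batching):
> {code:java}
> import java.util.Arrays;
> import java.util.Properties;
>
> import org.apache.kafka.clients.producer.KafkaProducer;
> import org.apache.kafka.clients.producer.ProducerConfig;
> import org.apache.kafka.clients.producer.ProducerRecord;
> import org.apache.kafka.common.serialization.ByteArraySerializer;
> import org.apache.kafka.common.serialization.StringSerializer;
>
> public class LargeBatchRepro {
>     public static void main(String[] args) {
>         Properties props = new Properties();
>         props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
>         props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
>         props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
>         props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "zstd");
>         props.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, 25 * 1024 * 1024); // 25 MB
>         props.put(ProducerConfig.LINGER_MS_CONFIG, 100);                     // encourage batching
>
>         // ~20 MB of repetitive data compresses to a tiny fraction of its size under zstd.
>         byte[] payload = new byte[20 * 1024 * 1024];
>         Arrays.fill(payload, (byte) 'a');
>
>         try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props)) {
>             for (int i = 0; i < 10; i++) {
>                 producer.send(new ProducerRecord<>("large-records", Integer.toString(i), payload));
>             }
>             producer.flush();
>         }
>     }
> }
> {code}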
> {*}Expected Behavior{*}: The producer should allow administrators to set both
> a pre-compression record size limit and a post-compression total request size
> limit.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)