[ https://issues.apache.org/jira/browse/KAFKA-19225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Christo Lolov updated KAFKA-19225:
----------------------------------
Fix Version/s: (was: 4.0.1)
> Tiered Storage Support for Active Log Segment
> ---------------------------------------------
>
> Key: KAFKA-19225
> URL: https://issues.apache.org/jira/browse/KAFKA-19225
> Project: Kafka
> Issue Type: New Feature
> Components: Tiered-Storage
> Affects Versions: 4.0.0
> Reporter: Henry Cai
> Assignee: Henry Cai
> Priority: Major
>
> This is the Jira for
> [KIP-1176|https://cwiki.apache.org/confluence/display/KAFKA/KIP-1176%3A+Tiered+Storage+for+Active+Log+Segment].
> In KIP-405, the community proposed and implemented tiered storage for
> old Kafka log segment files: when a log segment is older than
> {_}local.retention.ms{_}, it becomes eligible to be uploaded to the cloud's
> object storage and removed from local storage, thus reducing local storage
> cost. KIP-405 uploads only older log segments, not the most recent active
> log segments (write-ahead logs). Thus in a typical 3-way replicated Kafka
> cluster, the 2 follower brokers would still need to replicate the active log
> segments from the leader broker. It is common practice to set up the 3
> brokers in three different AZs to improve the high availability of the
> cluster. This causes replication traffic between leader and follower brokers
> to cross AZs, which is a significant cost ([various
> studies|https://www.confluent.io/blog/understanding-and-optimizing-your-kafka-costs-part-1-infrastructure/]
> show that cross-AZ transfer typically accounts for 50%-60% of the total
> cluster cost). Since all the active log segments are physically present on
> three Kafka brokers, they still consume significant resources on the
> brokers. The broker state is still quite large during node replacement,
> leading to longer node replacement times.
> [KIP-1150|https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics]
> recently proposed diskless Kafka topics, but it leads to increased latency and
> a significant redesign. In comparison, this proposed KIP maintains identical
> performance for the acks=1 producer path, minimizes design changes to Kafka,
> and still slashes cost by an estimated 43%.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)