RE: [DISCUSS] Incubating Proposal of Fluss

Michael Koepf Thu, 22 May 2025 03:37:29 -0700

Hi,

To reinforce the significance of Fluss -- and to better illustrate how it 
differs from existing systems -- I want to share this figure with the community 
(courtesy of Jark): 
https://alibaba.github.io/fluss-docs/assets/images/img7-06886bca9797751895c82d707cb04b2d.jpg

As a streaming storage system with a columnar storage format, Fluss fills the 
gap in the upper right quadrant, making it an ideal fit for real-time 
analytical use cases.

Since being open-sourced, Fluss has also established a very active 
community.

Looking forward to the incubation of Fluss.

--
Best,
MKO

GitHub: michaelkoepf

On 2025/05/21 08:43:22 Yu Li wrote:
> Hi All,
>
>
> I would like to propose Fluss [1] as a new Apache Incubator project. You
> can find more details in the Fluss proposal [2].
>
>
> Fluss is a distributed storage service designed to deliver high throughput
> and sub-second latency for streaming read and write operations. It aims to
> provide a unified data layer that bridges real-time processing with data
> lakehouse architectures. Building real-time analytics pipelines on top of a
> data lakehouse requires key capabilities such as tabular query support,
> efficient data updates, changelog subscriptions, and the ability to
> periodically snapshot data into lake file formats like Apache Iceberg and
> Apache Paimon -- functionalities that existing message queue systems such as
> Apache Kafka are not well suited to address.
>
>
> To tackle these challenges, Fluss offers the following features:
>
>
> 1. *Table-Oriented Data Model, Not Topics.* Unlike traditional messaging
> systems that rely on topics, Fluss treats tables as first-class citizens,
> aligning its data model with that of modern data lakehouses.
>
> 2. *Columnar Stream Storage.* By storing streaming data in a columnar
> format (specifically Apache Arrow), Fluss achieves up to 10x faster read
> performance for analytical queries over streaming data.
>
> 3. *Real-Time Updates and Changelog Subscription.* Fluss natively supports
> data updates and generates fine-grained changelogs, enabling low-latency
> incremental stream processing and state synchronization.
>
> 4. *Streaming & Lakehouse Unification.* Fluss enhances the stream
> processing capabilities of lakehouse architectures by seamlessly supporting
> both real-time ingestion and historical analysis within a single system.
>
>
> Fluss is currently deployed in production environments at Alibaba and many
> other companies, where it has reduced total operational costs by up to 80%
> compared to traditional message queue systems in a variety of use cases. In
> addition, the project has gained traction in the open-source community,
> with active adoption from organizations such as ByteDance, AntGroup,
> Ververica, eBay, Dynatrace, and Dream11. Many of these users have also
> contributed code and improvements, helping to form a vibrant and growing
> community with dozens of active developers.
>
>
> The proposed initial committers are eager to join the Apache Software
> Foundation (ASF) to foster broader collaboration and further strengthen the
> community. We believe that bringing Fluss into the Apache Incubator will
> unlock significant value for the broader open-source ecosystem.
>
>
> I am honored to serve as the champion for this project and will mentor it
> alongside three additional mentors (many thanks to them all):
>
>
> * Becket Qin (j...@apache.org)
>
> * Jingsong Lee (lzljs3620...@apache.org)
>
> * Zili (Tison) Chen (ti...@apache.org)
>
>
> Looking forward to your feedback. Thanks.
>
> Best Regards,
> Yu
>
> [1] https://github.com/alibaba/fluss
>
> [2] https://cwiki.apache.org/confluence/display/INCUBATOR/FlussProposal
>

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org
