Re: [DISCUSS] Incubating Proposal of Fluss

Huajie Wang Mon, 26 May 2025 20:25:40 -0700

Thanks for driving this discussion. Fluss is an interesting project, I
am glad to see Fluss joining the Apache incubator.



Best,
Huajie Wang



Hongshun Wang <loserwang1...@gmail.com> 于2025年5月21日周三 17:57写道：

> Thanks Yu for driving this discussion. I have waiting for a long time.
>
> Best,
> Hongshun
>
> On 2025/05/21 08:43:22 Yu Li wrote:
> > Hi All,
> >
> >
> > I would like to propose Fluss [1] as a new apache incubator project, and
> > you can find the proposal [2] of Fluss for more details.
> >
> >
> > Fluss is a distributed storage service designed to deliver high
> throughput
> > and sub-second latency for streaming read and write operations. It aims
> to
> > provide a unified data layer that bridges real-time processing with data
> > lakehouse architectures. Building real-time analytics pipelines on top
> of a
> > data lakehouse requires key capabilities such as tabular query support,
> > efficient data updates, changelog subscriptions, and the ability to
> > periodically snapshot data into lake file formats like Apache Iceberg and
> > Apache Paimon — functionalities that existing message queue systems such
> as
> > Apache Kafka are not well suited to address.
> >
> >
> > To tackle these challenges, Fluss offers the following features:
> >
> >
> > 1. *Table-Oriented Data Model, Not Topics.* Unlike traditional messaging
> > systems that rely on topics, Fluss treats tables as first-class citizens,
> > aligning its data model with that of modern data lakehouses.
> >
> > 2. *Columnar Stream Storage.* By storing streaming data in a columnar
> > format (specifically Apache Arrow), Fluss achieves up to 10x faster read
> > performance for analytical queries over streaming data.
> >
> > 3. *Real-Time Updates and Changelog Subscription.* Fluss natively
> supports
> > data updates and generates fine-grained changelogs, enabling low-latency
> > incremental stream processing and state synchronization.
> >
> > 4. *Streaming & Lakehouse Unification.* Fluss enhances the stream
> > processing capabilities of lakehouse architectures by seamlessly
> supporting
> > both real-time ingestion and historical analysis within a single system.
> >
> >
> > Fluss is currently deployed in production environments at Alibaba and
> many
> > other companies, where it has reduced total operational costs by up to
> 80%
> > compared to traditional message queue systems in a variety of use cases.
> In
> > addition, the project has gained traction in the open-source community,
> > with active adoption from organizations such as ByteDance, AntGroup,
> > Ververica, eBay, Dynatrace, and Dream11. Many of these users have also
> > contributed code and improvements, helping to form a vibrant and growing
> > community with dozens of active developers.
> >
> >
> > The proposed initial committers are eager to join the Apache Software
> > Foundation (ASF) to foster broader collaboration and further strengthen
> the
> > community. We believe that bringing Fluss into the Apache Incubator will
> > unlock significant value for the broader open-source ecosystem.
> >
> >
> > I am honored to serve as the champion for this project and will mentor it
> > alongside three additional mentors (many thanks to them all):
> >
> >
> > * Becket Qin (j...@apache.org)
> >
> > * Jingsong Lee (lzljs3620...@apache.org)
> >
> > * Zili (Tison) Chen (ti...@apache.org)
> >
> >
> > Look forward to your feedback. Thanks.
> >
> > Best Regards,
> > Yu
> >
> > [1] https://github.com/alibaba/fluss
> >
> > [2] https://cwiki.apache.org/confluence/display/INCUBATOR/FlussProposal
> >

Re: [DISCUSS] Incubating Proposal of Fluss

Reply via email to