Thanks for driving this discussion. Fluss is an interesting project, I am glad to see Fluss joining the Apache incubator.
Best, Huajie Wang Hongshun Wang <loserwang1...@gmail.com> 于2025年5月21日周三 17:57写道: > Thanks Yu for driving this discussion. I have waiting for a long time. > > Best, > Hongshun > > On 2025/05/21 08:43:22 Yu Li wrote: > > Hi All, > > > > > > I would like to propose Fluss [1] as a new apache incubator project, and > > you can find the proposal [2] of Fluss for more details. > > > > > > Fluss is a distributed storage service designed to deliver high > throughput > > and sub-second latency for streaming read and write operations. It aims > to > > provide a unified data layer that bridges real-time processing with data > > lakehouse architectures. Building real-time analytics pipelines on top > of a > > data lakehouse requires key capabilities such as tabular query support, > > efficient data updates, changelog subscriptions, and the ability to > > periodically snapshot data into lake file formats like Apache Iceberg and > > Apache Paimon — functionalities that existing message queue systems such > as > > Apache Kafka are not well suited to address. > > > > > > To tackle these challenges, Fluss offers the following features: > > > > > > 1. *Table-Oriented Data Model, Not Topics.* Unlike traditional messaging > > systems that rely on topics, Fluss treats tables as first-class citizens, > > aligning its data model with that of modern data lakehouses. > > > > 2. *Columnar Stream Storage.* By storing streaming data in a columnar > > format (specifically Apache Arrow), Fluss achieves up to 10x faster read > > performance for analytical queries over streaming data. > > > > 3. *Real-Time Updates and Changelog Subscription.* Fluss natively > supports > > data updates and generates fine-grained changelogs, enabling low-latency > > incremental stream processing and state synchronization. > > > > 4. *Streaming & Lakehouse Unification.* Fluss enhances the stream > > processing capabilities of lakehouse architectures by seamlessly > supporting > > both real-time ingestion and historical analysis within a single system. > > > > > > Fluss is currently deployed in production environments at Alibaba and > many > > other companies, where it has reduced total operational costs by up to > 80% > > compared to traditional message queue systems in a variety of use cases. > In > > addition, the project has gained traction in the open-source community, > > with active adoption from organizations such as ByteDance, AntGroup, > > Ververica, eBay, Dynatrace, and Dream11. Many of these users have also > > contributed code and improvements, helping to form a vibrant and growing > > community with dozens of active developers. > > > > > > The proposed initial committers are eager to join the Apache Software > > Foundation (ASF) to foster broader collaboration and further strengthen > the > > community. We believe that bringing Fluss into the Apache Incubator will > > unlock significant value for the broader open-source ecosystem. > > > > > > I am honored to serve as the champion for this project and will mentor it > > alongside three additional mentors (many thanks to them all): > > > > > > * Becket Qin (j...@apache.org) > > > > * Jingsong Lee (lzljs3620...@apache.org) > > > > * Zili (Tison) Chen (ti...@apache.org) > > > > > > Look forward to your feedback. Thanks. > > > > Best Regards, > > Yu > > > > [1] https://github.com/alibaba/fluss > > > > [2] https://cwiki.apache.org/confluence/display/INCUBATOR/FlussProposal > >