Hi all,

Thanks a lot for your suggestions and support. This thread has been open for almost 7 days, so I'm going to close it and create a new vote thread.

Thanks
Jerry

Aloys Zhang <aloyszh...@apache.org> wrote on Fri, May 27, 2022 at 06:29:

> +1 (non-binding) good luck
>
> Zhang Yonglun <zhangyong...@apache.org> wrote on Thu, May 26, 2022 at 20:52:
>
> > +1 (non-binding)
> >
> > --
> >
> > Zhang Yonglun
> > Apache ShenYu (Incubating)
> > Apache ShardingSphere
> >
> > Jerry Shao <js...@apache.org> wrote on Wed, May 25, 2022 at 00:07:
> >
> > > Hi all,
> > >
> > > Due to the name issue in thread (
> > > https://lists.apache.org/thread/y07xjkqzvpchncym9zr1hgm3c4l4ql0f), we
> > > figured out a new project name "Uniffle" and created a new thread.
> > > Please help to discuss.
> > >
> > > We would like to propose Uniffle[1] as a new Apache incubator project,
> > > you can find the proposal here [2] for more details.
> > >
> > > Uniffle is a high performance, general purpose Remote Shuffle Service
> > > for distributed compute engines like Apache Spark
> > > <https://spark.apache.org/>, Apache Hadoop MapReduce
> > > <https://hadoop.apache.org/>, Apache Flink <https://flink.apache.org/>
> > > and so on. We are aiming to make Firestorm a universal shuffle service
> > > for distributed compute engines.
> > >
> > > Shuffle is the key part for a distributed compute engine to exchange
> > > the data between distributed tasks, the performance and stability of
> > > shuffle will directly affect the whole job. Current “local file
> > > pull-like shuffle style” has several limitations:
> > >
> > >    1. Current shuffle is hard to support super large workloads,
> > >    especially in a high load environment, the major problem is IO
> > >    problem (random disk IO issue, network congestion and timeout).
> > >    2. Current shuffle is hard to deploy on the disaggregated compute
> > >    storage environment, as disk capacity is quite limited on compute
> > >    nodes.
> > >    3. The constraint of storing shuffle data locally makes it hard to
> > >    scale elastically.
> > >
> > > Remote Shuffle Service is the key technology for enterprises to build
> > > big data platforms, to expand big data applications to disaggregated,
> > > online-offline hybrid environments, and to solve above problems.
> > >
> > > The implementation of Remote Shuffle Service - “Uniffle” - is heavily
> > > adopted in Tencent, and shows its advantages in production. Other
> > > enterprises also adopted or prepared to adopt Firestorm in their
> > > environments.
> > >
> > > Uniffle's key idea is brought from Sailfish shuffle
> > > <https://www.researchgate.net/publication/262241541_Sailfish_a_framework_for_large_scale_data_processing>,
> > > it has several key design goals:
> > >
> > >    1. High performance. Firestorm’s performance is close enough to
> > >    local file based shuffle style for small workloads. For large
> > >    workloads, it is far better than the current shuffle style.
> > >    2. Fault tolerance. Firestorm provides high availability for
> > >    Coordinated nodes, and failover for Shuffle nodes.
> > >    3. Pluggable. Firestorm is highly pluggable, which could be suited
> > >    to different compute engines, different backend storages, and
> > >    different wire-protocols.
> > >
> > > We believe that Uniffle project will provide the great value for the
> > > community if it is accepted by the Apache incubator.
> > >
> > > I will help this project as champion and many thanks to the 3 mentors:
> > >
> > >    - Felix Cheung (felixche...@apache.org)
> > >    - Junping du (junping...@apache.org)
> > >    - Weiwei Yang (w...@apache.org)
> > >    - Xun liu (liu...@apache.org)
> > >    - Zhankun Tang (zt...@apache.org)
> > >
> > > [1] https://github.com/Tencent/Firestorm
> > > [2] https://cwiki.apache.org/confluence/display/INCUBATOR/UniffleProposal
> > >
> > > Best regards,
> > > Jerry
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > For additional commands, e-mail: general-h...@incubator.apache.org