+1 (non-binding)

Good luck!

Daniel Widdis <wid...@gmail.com> wrote on Wed, May 25, 2022 at 09:53:

> +1 (non-binding) from me!  Good luck!
>
> On 5/24/22, 9:05 AM, "Jerry Shao" <js...@apache.org> wrote:
>
>     Hi all,
>
>     Due to the naming issue raised in this thread (
>     https://lists.apache.org/thread/y07xjkqzvpchncym9zr1hgm3c4l4ql0f), we
>     have chosen a new project name, "Uniffle", and started a new thread.
>     Please help to discuss.
>
>     We would like to propose Uniffle [1] as a new Apache Incubator project.
>     You can find the proposal here [2] for more details.
>
>     Uniffle is a high-performance, general-purpose Remote Shuffle Service
>     for distributed compute engines such as Apache Spark
>     <https://spark.apache.org/>, Apache Hadoop MapReduce
>     <https://hadoop.apache.org/>, and Apache Flink
>     <https://flink.apache.org/>. We aim to make Uniffle a universal
>     shuffle service for distributed compute engines.
>
>     Shuffle is the key mechanism by which a distributed compute engine
>     exchanges data between distributed tasks, so the performance and
>     stability of shuffle directly affect the whole job. The current
>     “local file, pull-like” shuffle style has several limitations:
>
>        1. It struggles to support very large workloads, especially in a
>        high-load environment. The main problem is IO: random disk IO,
>        network congestion, and timeouts.
>        2. It is hard to deploy in a disaggregated compute-storage
>        environment, because disk capacity is quite limited on compute nodes.
>        3. The constraint of storing shuffle data locally makes it hard to
>        scale elastically.
>
>     Remote Shuffle Service is a key technology for enterprises to build
>     big data platforms, to expand big data applications to disaggregated,
>     online-offline hybrid environments, and to solve the problems above.
>
>     Uniffle, an implementation of Remote Shuffle Service, is heavily
>     used at Tencent and has shown its advantages in production. Other
>     enterprises have also adopted, or are preparing to adopt, Uniffle in
>     their environments.
>
>     Uniffle's key idea is derived from Sailfish shuffle
>     <https://www.researchgate.net/publication/262241541_Sailfish_a_framework_for_large_scale_data_processing>.
>     It has several key design goals:
>
>        1. High performance. Uniffle's performance is close to that of
>        local-file-based shuffle for small workloads. For large workloads,
>        it is far better than the current shuffle style.
>        2. Fault tolerance. Uniffle provides high availability for
>        Coordinator nodes, and failover for Shuffle nodes.
>        3. Pluggable. Uniffle is highly pluggable and can be adapted to
>        different compute engines, different backend storages, and different
>        wire protocols.
>
>     We believe that the Uniffle project will provide great value to the
>     community if it is accepted by the Apache Incubator.
>
>     I will help this project as champion, and many thanks to the mentors:
>
>        - Felix Cheung (felixche...@apache.org)
>        - Junping Du (junping...@apache.org)
>        - Weiwei Yang (w...@apache.org)
>        - Xun Liu (liu...@apache.org)
>        - Zhankun Tang (zt...@apache.org)
>
>
>     [1] https://github.com/Tencent/Firestorm
>     [2] https://cwiki.apache.org/confluence/display/INCUBATOR/UniffleProposal
>
>     Best regards,
>     Jerry
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>
>
