Hi,

On Fri, Jan 10, 2020 at 6:47 PM Vinod Kumar Vavilapalli <vino...@apache.org> wrote:
> I'd like to call a vote on accepting YuniKorn into the Apache Incubator...

+1

I'm copying the proposal text below; we usually do that to get complete mail archives.

-Bertrand

YuniKorn proposal

Abstract

YuniKorn is a standalone resource scheduler responsible for scheduling batch jobs and long-running services on large scale distributed systems running in on-premises environments as well as different public clouds.

Proposal

YuniKorn ['ju:nikɔ:n] is a unified resource scheduler aiming to achieve fine-grained resource sharing for various workloads efficiently in large scale, multi-tenant and cloud-native environments. YuniKorn brings a unified, cross-platform scheduling experience for mixed workloads, with support for, but not limited to, Apache™ Hadoop® YARN and Kubernetes.

YuniKorn is a made-up word (credit to Vinod Kumar Vavilapalli) - it’s made up of Y for Apache™ Hadoop® YARN, K for K8s, Uni for Unified, and its pronunciation is the same as “Unicorn”.

Currently, YuniKorn is an open-source project under the Apache 2.0 license. The source code is hosted in git repositories under the github.com/cloudera domain. We would like to share it with the ASF and expand the community to a wider range of users and contributors.

Background

Enterprise users run their workloads on different platforms such as Apache™ Hadoop® YARN and Kubernetes. They need to work with different resource schedulers in order to plan their workloads to run on these platforms efficiently. The scheduler implementations are fragmented, and not optimized to balance existing use-cases like batch workloads along with new needs such as cloud-native architecture, autoscaling, etc. We need a single resource planning/management framework to manage resources on different platforms using the same semantics, in order to address all the important resource management requirements.

Rationale

No solution exists today that addresses the need for a unified resource scheduling experience across platforms.
That makes it difficult to manage workloads running in different environments, from on-premise to cloud. YuniKorn aims to satisfy these needs. YuniKorn is designed around the following principles:

1) Support different environments

As compute platforms evolve quickly, more and more challenges appear in on-prem, cloud or hybrid environments. YuniKorn aims to bring a unified scheduling experience across multiple environments with enhanced scheduling capabilities.

2) Support a wide range of workloads

To improve the efficiency of the computing platform, a key idea is to run different types of applications, like long-running services and batch jobs, on shared resources. YuniKorn is an effort to address all the scheduling features needed for such mixed workload environments.

3) Benefit both big-data and cloud-native communities

A resource scheduler needs to be capable of supporting mixed workloads, both batch and long-running services. This is the key to improving cluster utilization and reducing the complexity of dev-ops. By creating a common scheduler that is decoupled from the container platforms underneath, it can benefit both the Apache™ Hadoop® YARN and the Kubernetes communities.

Initial Goals

Initial goals are:
- Move the existing codebase and documentation to an Apache-hosted repo
- Set up mailing lists, web-site, CI/CD pipeline under Apache infrastructure
- Set up JIRA for issue tracking
- Incremental development and releases according to Apache guidelines
- Expand the community and bring more diversified contributors/users to the community

Current Status

Meritocracy

Many of the initial developers of YuniKorn are already Apache committers and PMC members from other Apache projects, such as Apache Hadoop and Apache Submarine. Many of us have worked in the Apache Hadoop community for years and know the Apache way well. We believe strongly in meritocracy in electing committers and PMC members.
We believe that contributions can come in forms other than just code: for example, one of our initial proposed committers has contributed solely in the area of project documentation. We will encourage contributions and participation of all types, and ensure that contributors are appropriately recognized.

Community

YuniKorn is a relatively new open source project; Cloudera is the original development sponsor for YuniKorn. From the beginning of the project, we clearly aimed to have this as an open source project, so we started to build the community from the very early stages. We received a lot of feedback and valuable suggestions from other community members while the project was hosted as an open source project on GitHub. This feedback has greatly influenced some of our designs. For example, developers from Alibaba were involved from a very early stage of development and contributed a lot of the effort related to performance/throughput enhancement. Many other organizations also showed interest in joining the community once we started talking about it at meetups, conferences, etc.

Core developers

The project was initiated at Cloudera, so the core developers are mostly from this organization. Tao Yang from Alibaba joined the development at a very early stage. The core developers of YuniKorn are (listed in alphabetical order):
- Akhil PB (Cloudera)
- Sunil Govindan (Cloudera)
- Tao Yang (Alibaba)
- Vinod Vavilapalli (Cloudera)
- Wangda Tan (Cloudera)
- Weiwei Yang (Cloudera)
- Wilfred Spiegelenburg (Cloudera)

Given the origin history, the core development team so far has not been very diverse, but we’ve been attempting to grow that diversity. We have every hope to continue building a diverse and sustainable community if the project gets accepted into Apache.

Alignment

The motivation of the YuniKorn project is to resolve common resource scheduling problems for various workloads on large scale distributed systems.
Apache is home to one of these systems in the form of Apache Hadoop YARN. Many of the workloads that we expect to leverage YuniKorn are computing engines like Apache Spark and Apache Flink, whether they run on top of YARN or on Kubernetes.

Known Risks

Project Name

We have done a search of the name "YuniKorn" on GitHub, and at the time of the search we found nothing related to resource schedulers or distributed systems. We also did a search of the name YuniKorn as a trademark and there seem to be none. A generic web search also didn't return any relevant projects. Since the name seems to be unique, easy to remember and pronounce, and relevant to the project, we believe it is a suitable name even at the ASF. Cloudera does NOT have a trademark on the name YuniKorn, so there is no trademark assignment needed. Cloudera will commit to using Apache YuniKorn as the project name when/if it graduates and becomes an Apache project.

Orphaned products

The core developers of the YuniKorn project, who come from different companies, plan to work full time on this project. The initial team intends to continue investing in the YuniKorn project, and it will be integrated into the solutions offered to customers. Several other organizations (like Alibaba) have also started to evaluate the project, and plan to adopt it in their production environments. We anticipate that adoption will improve further once it becomes an Apache project. We also have support from core-platform developers and Apache committers from different companies like Microsoft, Nvidia, Tencent, etc., who are interested in contributing to the YuniKorn project. We’re expecting to see more contributions from these committers and usage by their internal platforms. So overall, the risk of YuniKorn being an orphaned project is low.

Inexperience with Open source

Most of the core developers in the YuniKorn project are experienced open source veterans; several developers are Apache committers and PMC members of other projects, such as Apache™ Hadoop®. The development style is already very much like the Apache way:
- We have open community meetings to discuss designs, problems and roadmaps
- We publish all patches and issue-related discussions on GitHub
- We enforce code review and log all comments in GitHub issues

Length of Incubation

We started the work 10 months ago; so far, the groundwork for YuniKorn is done and the initial version works with K8s seamlessly. Based on the initial contributors’ experience in ASF projects, we don’t expect that there will be huge gaps before YuniKorn can graduate with regard to ASF’s policies on software and releases. The goal is to grow the community quickly and increase the user base within a few months while making releases that adhere to the ASF standards. When it reaches a reasonable level of adoption and a strong community with a good number of committers/PMC members, we can propose graduation. We expect the length of incubation to be approximately 12 to 18 months.

Homogenous Development

The initial proposed list of committers and contributors includes developers from several institutions and industry participants. The developers are also from different regions like the U.S., Australia, and India, and the development team uses Slack, a community mailing list, and weekly community calls to collaborate efficiently.

Reliance on Salaried Developers

Clearly, Cloudera has contributed most of the initial development through salaried developers. But since the very beginning, YuniKorn has been built as a community effort. We have people from other organizations who are already collaborating with us on GitHub. This includes contributions at the source code level as well as participation in designs and feedback through community calls.
We expect our reliance on salaried developers to decrease drastically during the incubation process itself.

Relationship to Other Apache Products

YuniKorn is very closely related to other Big-Data projects in Apache, such as Hadoop YARN, Spark, Hive, Flink, etc. YuniKorn’s core idea is to support both long-running and batch workloads like Spark, Hive, Flink, etc., and provide a consistent, unified way to manage and schedule resources for Big Data workloads across resource managers like Apache™ Hadoop® YARN and Kubernetes, in both on-premise and cloud environments. Many of the core ideas for YuniKorn come from the experience of the initial team building Apache Hadoop YARN’s schedulers - Capacity Scheduler and Fair Scheduler.

An Excessive Fascination with the Apache Brand

Many of the initial developers in the YuniKorn project are already experienced Apache committers and PMC members. We understand the value of the Apache way, and how to operate the project development on a day to day basis. The reason for proposing YuniKorn as an Apache project is to build a healthy community and to increase adoption and the size of the community and end-user base, because we believe the only way to build highly valuable infrastructure layer software is to have wide adoption and cater to common use cases.

Documentation

- Project summary: https://github.com/cloudera/yunikorn-core/blob/master/README.md
- User guides: https://github.com/cloudera/yunikorn-core/blob/master/docs/user-guide.md
- Developer guides: https://github.com/cloudera/yunikorn-core/blob/master/docs/developer-guide.md
- Roadmap: https://github.com/cloudera/yunikorn-core/blob/master/docs/roadmap.md

Initial Source

YuniKorn is written in Golang, and currently the source code is hosted in several GitHub repositories:
- Scheduler interface: https://github.com/cloudera/yunikorn-scheduler-interface
- Scheduler core: https://github.com/cloudera/yunikorn-core
- K8s Shim: https://github.com/cloudera/yunikorn-k8shim
- Scheduler Web UI: https://github.com/cloudera/yunikorn-web

Source and Intellectual Property Submission Plan

External Dependencies

External dependencies are listed in the table below.

Library                              Type                            License
k8s.io/api                           K8s API                         Apache License 2.0
k8s.io/apimachinery                  K8s API                         Apache License 2.0
k8s.io/client-go                     K8s client library              Apache License 2.0
github.com/looplab/fsm               Go state machine library        MIT License
github.com/satori/go.uuid            Go UUID library                 MIT License
github.com/uber-go/zap               Go logging library              MIT License
github.com/golang/protobuf           Go protobuf library             BSD 3-Clause License
github.com/gorilla/mux               Go network library              BSD 3-Clause License
google.golang.org/grpc               Go RPC library                  Apache License 2.0
gopkg.in/yaml.v2                     Go YAML library                 Apache License 2.0
github.com/prometheus/client_golang  Prometheus Client Library       Apache License 2.0
Angular v6.1.x                       Angular UI Framework Libraries  MIT License
TypeScript                           TypeScript Language Compiler    Apache License 2.0
Chart.js                             JavaScript Charting Library     MIT License
Moment.js                            JavaScript Date & Time Library  MIT License

Build and test only:

Library                              Type                            License
gotest.tools                         Test library                    Apache License 2.0
github.com/stretchr/testify          Test library                    MIT License
Karma                                Unit test library               MIT License
Protractor                           End2End test library            MIT License
Json-server                          Test server                     MIT License
Yarn                                 Dependency manager              BSD 2-Clause License

Cryptography

YuniKorn does not currently include any cryptography-related code.

Required Resources

Mailing lists:
- priv...@yunikorn.incubator.apache.org (PMC list)
- comm...@yunikorn.incubator.apache.org (git push emails)
- iss...@yunikorn.incubator.apache.org (JIRA issue feed)
- d...@yunikorn.incubator.apache.org (Dev discussion)
- u...@yunikorn.incubator.apache.org (User questions)

Git Repositories

Git is the preferred source control system.
git://git.apache.org/yunikorn-* (We have multiple git repositories)

Issue Tracking

JIRA YuniKorn (YUNIKORN-)

Other Resources

None

Initial Committers and Affinities

- Akhil PB (a...@cloudera.com) (Cloudera)
- Sunil Govindan (sun...@apache.org) (Cloudera)
- Vinod Kumar Vavilapalli (vino...@apache.org) (Cloudera)
- Wangda Tan (wan...@apache.org) (Cloudera)
- Weiwei Yang (w...@apache.org) (Cloudera)
- Wilfred Spiegelenburg (wspiegelenb...@cloudera.com) (Cloudera)
- Carlo Curino (cur...@apache.org) (Microsoft)
- Subramaniam Krishnan (su...@apache.org) (Microsoft)
- Arun Suresh (asur...@apache.org) (Microsoft)
- Konstantinos Karanasos (kkarana...@apache.org) (Microsoft)
- Jonathan Hung (jh...@apache.org) (LinkedIn)
- DB Tsai (dbt...@apache.org) (Apple)
- Junping Du (junping...@apache.org) (Tencent)
- Tao Yang (taoy...@apache.org) (Alibaba)
- Jason Lowe (jl...@apache.org) (Nvidia)

Sponsors

Champion

Vinod Kumar Vavilapalli (vino...@apache.org)

Nominated Mentors

- Junping Du (Tencent), (junping...@apache.org)
- Felix Cheung (Uber), (felixche...@apache.org)
- Jason Lowe (Nvidia), (jl...@apache.org)
- Holden Karau (Apple), (hol...@apache.org)

Sponsoring Entity

The Apache Incubator