Greetings!

We would like to start an open discussion on bringing Linkis (
https://github.com/WeBankFinTech/Linkis), a computation middleware project,
to the Apache Incubator.


The proposal can be found below and is also listed in the Incubator wiki:
https://cwiki.apache.org/confluence/display/INCUBATOR/LinkisProposal. We
appreciate anyone who would give guidance or be willing to support us as an
additional mentor.


======
Linkis Proposal




=Abstract=

Linkis builds a computation middleware layer that decouples upper-layer
applications from the underlying data engines. It provides standardized
interfaces (REST, JDBC, WebSocket, etc.) for connecting to various
underlying engines (Spark, Presto, Flink, etc.), and enables cross-engine
context sharing as well as unified job and engine governance and
orchestration.

Linkis codebase: https://github.com/WeBankFinTech/Linkis




=Proposal=

Linkis is designed to solve computation governance problems in complex
distributed environments (typically in a big data platform), where you have
to deal with different types, versions, or clusters of underlying data
engines and hundreds of diversified engine clients at the upper application
layer as well.

Linkis acts as a proxy between the upper application layer and the
underlying engine layer. By abstracting and implementing the 3 common
phases of a job/request (submit, prepare and execute), Linkis provides
connectivity, governance and orchestration capabilities for different kinds
of engines such as OLAP, OLTP (under development) and Streaming, and handles
all these "computation governance" affairs in a standardized, reusable way.
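
To make the three phases concrete, here is a minimal Python sketch of a
middleware that takes a job through submit, prepare and execute. It is only
an illustration under stated assumptions: the names Job, EngineConnector,
EchoConnector and Middleware are hypothetical and are not taken from the
Linkis codebase, and the "engine" simply echoes its input.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical job record; the fields are illustrative only."""
    code: str
    engine_type: str
    state: str = "created"
    result: object = None
    history: list = field(default_factory=list)

    def move_to(self, state):
        # Record every state transition so the lifecycle can be audited.
        self.state = state
        self.history.append(state)


class EngineConnector(ABC):
    """Hypothetical adapter hiding one underlying engine."""

    @abstractmethod
    def run(self, code):
        ...


class EchoConnector(EngineConnector):
    """Stand-in engine that returns the code it was given."""

    def run(self, code):
        return f"echo: {code}"


class Middleware:
    def __init__(self, connectors):
        # Maps an engine type name to the connector that serves it.
        self.connectors = connectors

    def submit(self, code, engine_type):
        job = Job(code=code, engine_type=engine_type)
        job.move_to("submitted")
        return job

    def prepare(self, job):
        # Pick the connector for the requested engine, or fail the job.
        connector = self.connectors.get(job.engine_type)
        if connector is None:
            job.move_to("failed")
            raise ValueError(f"unsupported engine: {job.engine_type}")
        job.move_to("prepared")
        return connector

    def execute(self, job, connector):
        job.move_to("running")
        job.result = connector.run(job.code)
        job.move_to("succeeded")
        return job.result

    def handle(self, code, engine_type):
        # The application only calls this; the phases stay internal.
        job = self.submit(code, engine_type)
        connector = self.prepare(job)
        self.execute(job, connector)
        return job
```

In this sketch the upper application only calls handle(); adding a new
engine means registering one more connector, without touching the callers.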

We are actively operating the Linkis community and look forward to
continuously increasing community activity.

We propose to contribute the Linkis codebase to the Apache Software
Foundation. We believe that bringing Linkis into the Apache Software
Foundation and following the community-led development model of the
"Apache Way" could continuously improve project quality and community
vitality.




=Background=

In today's complex, distributed environments, mature solutions exist for
the communication, coordination and governance of application services,
from SOA to microservices, along with many practices for decoupling
services, from ESB to Service Mesh.

However, things are different when an application service needs to
communicate with the underlying engines. Engines are isolated from each
other, and the tightly coupled client-server pattern is everywhere. Each
and every upper application has to connect directly to various underlying
engines in a tightly coupled way, and solve the "computation governance"
problems on its own, including maintaining different client environments,
submitting jobs, monitoring job status, fetching the output, handling large
numbers of concurrent client instances, watching for bad jobs, adapting to
engine version changes, etc.

What is missing is a common layer of "computation middleware" between the
numerous upper-layer applications and the countless underlying engines to
handle all these "computation governance" affairs in a standardized,
reusable way. That is why we started the Linkis project.

Firstly, Linkis could reduce the complexity of connectivity. Instead of
maintaining a variety of engine client environments, users now only need to
install the Linkis client, or just an HTTP client when using the REST
interface. Routing a query to the desired cluster can be done simply by
providing a tag.
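
As a rough illustration of tag-based routing, the Python sketch below picks
a target cluster from the tag attached to a request. The names Cluster and
TagRouter and the example cluster names are hypothetical; this is not the
Linkis API, only a sketch of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical cluster description: a name plus the tags it carries."""
    name: str
    tags: frozenset


class TagRouter:
    def __init__(self, clusters):
        self._clusters = list(clusters)

    def route(self, tag):
        # Return the first registered cluster that carries the tag.
        for cluster in self._clusters:
            if tag in cluster.tags:
                return cluster
        raise LookupError(f"no cluster carries tag {tag!r}")


# Example registry with made-up cluster names.
router = TagRouter([
    Cluster("spark-prod", frozenset({"prod", "spark"})),
    Cluster("spark-test", frozenset({"test", "spark"})),
])
```

With this registry, router.route("test") selects the "spark-test" cluster,
so the client never needs to know cluster addresses or client settings.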

Secondly, Linkis provides governance capabilities such as multi-tenancy,
concurrency control, resource management, query validation, privilege
enhancement and auditing.
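
One of these capabilities, concurrency control per tenant, could look
roughly like the Python sketch below. It assumes a simple fixed limit of
running jobs per tenant; the class name TenantLimiter and its methods are
hypothetical and do not describe how Linkis implements this.

```python
import threading


class TenantLimiter:
    """Hypothetical per-tenant cap on the number of running jobs."""

    def __init__(self, max_concurrent):
        self._max = max_concurrent
        self._running = {}
        self._lock = threading.Lock()

    def try_acquire(self, tenant):
        # Admit the job only if the tenant is below its limit.
        with self._lock:
            current = self._running.get(tenant, 0)
            if current >= self._max:
                return False
            self._running[tenant] = current + 1
            return True

    def release(self, tenant):
        # Called when a job finishes, freeing one slot for the tenant.
        with self._lock:
            current = self._running.get(tenant, 0)
            if current > 0:
                self._running[tenant] = current - 1
```

Because the counters are kept per tenant, one tenant reaching its limit
does not block jobs submitted by another tenant.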

Meanwhile, Linkis enables orchestration strategies such as routing, load
balancing, active-active and hybrid computation across engines (some are
still under development).




=Rationale=

Linkis is built on a distributed microservice architecture with great
scalability and extensibility. Its enhancements for high concurrency and
fault tolerance make it more stable and reliable. It has already supported
many production environments with a large number of daily jobs over a long
period.

Linkis's microservices are divided into 3 groups: Computation Governance
Services, Public Enhancement Services, and Microservice Governance Services.

The Computation Governance Services (CGS) group is responsible for the core
process of job/request submission, preparation and execution, lifecycle
management, resource management, validation and orchestration.

The Public Enhancement Services (PES) group provides basic public functions
including job context sharing, material management and data source
management, to serve other Linkis services and upper application systems.

The Microservice Governance Services (MGS) group includes customized Spring
Cloud Gateway, Eureka and OpenFeign, to provide basic functions like
routing, service registration and discovery, and an RPC framework.

By providing multi-tenancy, high concurrency, job dispatching/management
policies, unified resource control and orchestration, Linkis makes the
submission, preparation and execution of computation jobs more flexible,
reliable and controllable, and returns the results successfully. It could
greatly reduce the overall development, operation and maintenance costs,
and the architecture complexity.

With Linkis as the computation middleware, new upper-layer applications
can be developed quickly by reusing Linkis's computation governance
functions, as has been done in the open source big data platform suite
“WeDataSphere” (https://github.com/WeBankFinTech/WeDataSphere).

Linkis currently mainly supports OLAP and Streaming engines, and we are
planning to support OLTP engines better. Containerization is also one of
the important development directions of Linkis.




=Initial Goal=

- Migrate the existing codebase, website, and documentation to
Apache-hosted infrastructure.

- Work with the infrastructure team to implement and approve our code
review, build, and testing workflows in the context of the ASF.

- Incremental development and release under Apache guidelines.

- Grow and diversify the Linkis community in the Apache Way.




=Current Status=

==Meritocracy==

The Linkis project was started at WeBank and has been an open-source
project on GitHub since July 2019. Linkis has been quickly adopted by many
organizations: more than 500 organizations have tested Linkis according to
our sandbox application records, and dozens of them have introduced Linkis
into production according to spontaneous user feedback. These organizations
span various industries including banking, telecommunications, insurance,
manufacturing, education, internet, etc.

Linkis already has contributors and users from different companies. We’ve
set up the Committer team and we’re constantly seeking potential new
committers. New contributors are always welcome and are guided by existing
committers. Users can get timely support from community IM groups and
GitHub.




==Community==

Linkis now has 15 committers from 6 companies including WeBank, China
Telecom, Kanzhun Ltd., iQIYI Inc., HONOR Mobile Phone, and Samoyed Digital.
We have a developer IM group with more than 100 people from different
organizations, and 9 user IM groups with more than 4,500 people.




==Core Developers==

The core developers of Linkis work in the big data teams of different
companies, mainly at WeBank since the project was initiated there.

- Shuai Di (WeBank)

- Qiang Yin (WeBank)

- Heping Wang (WeBank)

- Yongkun Yang (WeBank)

- Zhiyue Yang (WeBank)

- You Liu (WeBank)

- Xiaogang Wang (China Telecom)

- Hui Zhu (Kanzhun)

- Zhen Wang (iQiyi)

- Rong Zhang (Honor)




==Releases==

Linkis has released multiple versions as listed here:
https://github.com/WeBankFinTech/Linkis/releases

We will follow the ASF guidelines more closely, and adopt the ASF source
release process upon joining the incubator.




==Code Reviews==

Linkis’s code reviews are currently public on GitHub:
https://github.com/WeBankFinTech/Linkis/pulls




==Alignment==

As Linkis was built to address connectivity and other computation
governance issues with various underlying engines, it depends on multiple
ASF projects such as Spark, Flink, Hive and Hadoop. Linkis’s Engine
Connector Manager service starts different Engine Connectors to connect to
different underlying engines, providing computation governance abilities
that benefit the usage and maintenance of these engines. Linkis will
continue to expand the range of ASF engines it supports, such as HBase,
Kylin, and more.




=Known Risks=

==Orphaned Products==

The risk of Linkis becoming an orphaned product is very low, because it is
already a core infrastructure component in the production environments of
dozens of companies' big data platforms, including large companies like
WeBank, China Telecom, Ping An Insurance Company, Hikvision, etc. Hundreds
of thousands of computation jobs are performed through Linkis in these
companies every day. Developers from these companies are increasingly
joining the Linkis community as contributors.

Linkis has had 12 major releases so far and has received 355 PRs from
contributors, which indicates the activity and vitality of the Linkis
community. Linkis is also the core component of the open source big data
platform suite “WeDataSphere”; even more users and developers are already
active in this larger community.

We look forward to further expanding and diversifying the community by
joining Apache. We are also further improving our adherence to the
community-led development pattern, and the standardization and transparency
of community governance.




==Inexperience with Open Source==

Linkis’s core developers have been running Linkis as a community-oriented
open source project for some time, and some of them already have experience
working with other open source communities. The current Linkis user group
of more than 4,500 people is also proof of our commitment to and passion
for operating the open source community.

Meanwhile, we’ve begun to refine our community governance efforts under the
guidance of Apache mentors, and we’ll learn more about how to operate the
open source community effectively and properly by following the Apache way
in our incubator journey.




==Homogenous Developers==

Most of the current core developers work at WeBank, where the Linkis
project started. Developers from China Telecom, Kanzhun, iQiyi and Honor
Mobile Phone have also been elected to the committer group and have already
led the release of several versions of Linkis. Samoyed Digital has the most
recently nominated committer, in recognition of solid contributions to the
Linkis data source management module.

Though the Linkis community may not be diverse enough yet, we are
constantly looking for new contributors and potential committers to enhance
the diversity of the community and the vitality of the project.




==An Excessive Fascination with the Apache Brand==

We acknowledge that the Apache brand would add a lot of value and
reputation to Linkis, and would benefit cooperation and promotion on a
global scale. However, our primary purpose in submitting Linkis to Apache
is to build a more diverse and viable community and to gain stability for
long-term development. We will also strictly follow the ASF's rules and
policies under the guidance of the Incubator PMC.




=Documentation=

Documentation about Linkis can be found at
https://github.com/WeBankFinTech/Linkis-Doc. The following links provide
more information:

- Codebase at GitHub: https://github.com/WeBankFinTech/Linkis

- Issue Tracking: https://github.com/WeBankFinTech/Linkis/issues

- Releases: https://github.com/WeBankFinTech/Linkis/releases







=Initial Source=

https://github.com/WeBankFinTech/Linkis




=External Dependencies=




Back-end:

| Dependencies | License | Comment |
| caffeine | Apache 2.0 | |
| cglib | Apache 2.0 | |
| commons-beanutils | Apache 2.0 | |
| commons-codec | Apache 2.0 | |
| commons-collections | Apache 2.0 | |
| commons-dbcp | Apache 2.0 | |
| commons-exec | Apache 2.0 | |
| commons-io | Apache 2.0 | |
| commons-lang3 | Apache 2.0 | |
| commons-math3 | Apache 2.0 | |
| commons-net | Apache 2.0 | |
| commons-text | Apache 2.0 | |
| dozer-core | Apache 2.0 | |
| druid | Apache 2.0 | |
| fastjson | Apache 2.0 | |
| gson | Apache 2.0 | |
| guava | Apache 2.0 | |
| hadoop-auth | Apache 2.0 | |
| hadoop-client | Apache 2.0 | |
| hadoop-common | Apache 2.0 | |
| hadoop-hdfs | Apache 2.0 | |
| hadoop-yarn-client | Apache 2.0 | |
| hive-common | Apache 2.0 | |
| hive-exec | Apache 2.0 | |
| hive-jdbc | Apache 2.0 | |
| httpclient | Apache 2.0 | |
| httpmime | Apache 2.0 | |
| jackson-annotations | Apache 2.0 | |
| jackson-databind | Apache 2.0 | |
| jackson-module-scala | Apache 2.0 | |
| javacsv | LGPL | |
| jaxrs-ri | CDDL, GPL 1.1 | will remove |
| jersey-container-servlet | CDDL, GPL 1.1 | will remove |
| jersey-container-servlet-core | CDDL, GPL 1.1 | will remove |
| jersey-entity-filtering | CDDL, GPL 1.1 | will remove |
| jersey-json | CDDL, GPL 1.1 | will remove |
| jersey-media-json-jackson | CDDL, GPL 1.1 | will remove |
| jersey-media-multipart | CDDL, GPL 1.1 | will remove |
| jersey-server | CDDL, GPL 1.1 | will remove |
| jersey-servlet | CDDL, GPL 1.1 | will remove |
| jersey-spring3 | CDDL, GPL 1.1 | will remove |
| jetty-server | Apache 2.0, EPL 1.0 | |
| jetty-webapp | Apache 2.0, EPL 1.0 | |
| json4s-jackson | Apache 2.0 | |
| jsp-api | CDDL, GPL 2.0 | will remove |
| junit | EPL 1.0 | |
| libthrift | Apache 2.0 | |
| log4j-1.2-api | Apache 2.0 | |
| log4j-api | Apache 2.0 | |
| log4j-core | Apache 2.0 | |
| log4j-slf4j-impl | Apache 2.0 | |
| mockito-all | MIT | |
| mybatis-plus-boot-starter | Apache 2.0 | |
| mysql-connector-java | GPL 2.0 | will remove |
| netty-all | Apache 2.0 | |
| pagehelper | MIT | |
| poi-ooxml | Apache 2.0 | |
| protostuff-api | Apache 2.0 | |
| protostuff-core | Apache 2.0 | |
| protostuff-runtime | Apache 2.0 | |
| py4j | BSD 2-clause | |
| reactor-netty | Apache 2.0 | |
| reflections | BSD 2-clause | |
| scalacheck | BSD 3-clause | |
| scalacheck-shapeless | Apache 2.0 | |
| scala-compiler | Apache 2.0 | |
| scala-library | Apache 2.0 | |
| scalamock-scalatest-support | MIT | |
| scalap | Apache 2.0 | |
| scala-reflect | Apache 2.0 | |
| scalatest | Apache 2.0 | |
| slf4j-api | MIT | |
| spark-core | Apache 2.0 | |
| spark-hive | Apache 2.0 | |
| spark-repl | Apache 2.0 | |
| spark-sql | Apache 2.0 | |
| spark-testing-base | Apache 2.0 | |
| spoiwo | MIT | |
| spring-boot | Apache 2.0 | |
| spring-boot-actuator-autoconfigure | Apache 2.0 | |
| spring-boot-starter | Apache 2.0 | |
| spring-boot-starter-actuator | Apache 2.0 | |
| spring-boot-starter-aop | Apache 2.0 | |
| spring-boot-starter-cache | Apache 2.0 | |
| spring-boot-starter-jetty | Apache 2.0 | |
| spring-boot-starter-log4j2 | Apache 2.0 | |
| spring-boot-starter-quartz | Apache 2.0 | |
| spring-boot-starter-reactor-netty | Apache 2.0 | |
| spring-boot-starter-web | Apache 2.0 | |
| spring-cloud-commons | Apache 2.0 | |
| spring-cloud-config-client | Apache 2.0 | |
| spring-cloud-context | Apache 2.0 | |
| spring-cloud-gateway-core | Apache 2.0 | |
| spring-cloud-starter | Apache 2.0 | |
| spring-cloud-starter-config | Apache 2.0 | |
| spring-cloud-starter-gateway | Apache 2.0 | |
| spring-cloud-starter-netflix-eureka-client | Apache 2.0 | |
| spring-cloud-starter-netflix-eureka-server | Apache 2.0 | |
| spring-cloud-starter-openfeign | Apache 2.0 | |
| spring-core | Apache 2.0 | |
| spring-jdbc | Apache 2.0 | |
| spring-security-crypto | Apache 2.0 | |
| spring-test | Apache 2.0 | |
| spring-tx | Apache 2.0 | |
| spring-web | Apache 2.0 | |
| websocket-client | Apache 2.0, EPL 1.0 | |
| websocket-server | Apache 2.0, EPL 1.0 | |
| xlsx-streamer | Apache 2.0 | |
| xstream | BSD 3-clause | |




Front-end:

| Dependencies | License | Comment |
| axios | MIT | |
| highlight.js | BSD-3-Clause | |
| iview | MIT | |
| lodash | MIT | |
| moment | MIT | |
| monaco-editor | MIT | |
| sql-formatter | MIT | |
| svgo | MIT | |
| vue | MIT | |
| vue-i18n | MIT | |
| vue-router | MIT | |
| vuedraggable | MIT | |
| vuescroll | MIT | |




=Required Resources=




==Mailing List==

Currently Linkis has no mailing list. The usual mailing lists are expected
to be set up when entering incubation:

- private@linkis.incubator.apache.org for PPMC discussions;

- d...@linkis.incubator.apache.org for development discussions;

- notificat...@linkis.incubator.apache.org for user notifications, and
notifications from GitHub.




==Git Repositories==

Upon entering incubation, we request to move the existing repository from
https://github.com/WeBankFinTech/Linkis to Apache infrastructure like
https://github.com/apache/Incubator-Linkis.




==Issue Tracking==

The Linkis community would like to continue using GitHub Issues if possible.




==Other Resources==

Apache Jenkins




=Source and Intellectual Property Submission Plan=

Most of the current code is Apache 2.0 licensed and the copyright is
assigned to WeBank. If the project enters the Incubator, WeBank will
transfer the source code and trademark ownership to the ASF via a Software
Grant Agreement.




=Initial Committers=

- Shuai Di (shuaidi1...@163.com)

- Qiang Yin (690574...@qq.com)

- Heping Wang (374126...@qq.com)

- Yongkun Yang (wimkun...@gmail.com)

- Zhiyue Yang (904666...@qq.com)

- You Liu (405240...@qq.com)

- Deyi Hua (david_hua1...@hotmail.com)

- Le Bai (120190...@qq.com)

- Xiaogang Wang (913546...@qq.com)

- Hui Zhu (46580...@qq.com)

- Zhen Wang (643348...@qq.com)

- Rong Zhang (693404...@qq.com)

- Xiaohua Yi (405078...@qq.com)

- Ke Zhou (zhouke...@vip.qq.com)

- Jian Xie (xj...@163.com)




=Affiliations=

Shuai Di, Qiang Yin, Heping Wang, Yongkun Yang, Zhiyue Yang, You Liu, Deyi
Hua, Le Bai, Ke Zhou and Jian Xie of the initial committers are employees
of WeBank.

Xiaogang Wang of the initial committers is an employee of China Telecom.

Hui Zhu of the initial committers is an employee of Kanzhun.

Zhen Wang of the initial committers is an employee of iQiyi.

Rong Zhang of the initial committers is an employee of HONOR Mobile Phone.

Xiaohua Yi of the initial committers is an employee of Samoyed Digital.




=Sponsors=

==Champion==

Junping Du (ASF Member, IPMC Member), junping...@apache.org




==Nominated Mentors==

Shao Feng Shi (ASF Member, IPMC Member), shaofeng...@apache.org

Duo Zhang (ASF Member, IPMC Member), zhang...@apache.org

Jerry Shao (ASF Member, IPMC Member), js...@apache.org

Lidong Dai (IPMC Member), lidong...@apache.org




=Sponsoring Entity=

We request the Apache Incubator to sponsor this project.

======

Best Regards,

Shuai Di
