Re: [VOTE] Accept Druid into the Apache Incubator

Jyotirmoy Sundi Fri, 23 Feb 2018 01:03:07 -0800
+1 Vote

On 2018/02/22 19:03:55, Julian Hyde <j...@apache.org> wrote: 
> Hi all,
>
> After some discussion on the Druid proposal[1], I'd like to
> start a vote on accepting Druid into the Apache Incubator,
> per the ASF policy[2] and voting rules[3].
>
> A vote for accepting a new Apache Incubator podling is a
> majority vote for which only Incubator PMC member votes are
> binding. Votes from other people are also welcome as an
> indication of people's enthusiasm (or lack thereof).
>
> Please do not use this VOTE thread for discussions. If
> needed, start a new thread instead.
>
> This vote will run for at least 72 hours. Please VOTE as
> follows:
>  [ ] +1 Accept Druid into the Apache Incubator
>  [ ] +0 Abstain
>  [ ] -1 Do not accept Druid into the Apache Incubator
>         because ...
>
> The proposal is listed below, but you can also access it on
> the wiki[4].
>
> Julian
>
> [1] https://lists.apache.org/thread.html/b95f90a30b6e8587e9b108f368b07c1b3e23e25ca592448d9c9f81e2@%3Cgeneral.incubator.apache.org%3E
> [2] https://incubator.apache.org/policy/incubation.html#approval_of_proposal_by_sponsor
> [3] http://www.apache.org/foundation/voting.html
> [4] https://wiki.apache.org/incubator/DruidProposal
>
> = Druid Proposal =
>
> == Abstract ==
>
> Druid is a high-performance, column-oriented, distributed
> data store.
>
> == Proposal ==
>
> Druid is an open source data store designed for real-time
> exploratory analytics on large data sets. Druid's key
> features are a column-oriented storage layout, a distributed
> shared-nothing architecture, and the ability to generate and
> leverage indexing and caching structures. Druid is typically
> deployed in clusters of tens to hundreds of nodes, and has
> the ability to load data from Apache Kafka and Apache
> Hadoop, among other data sources. Druid offers two query
> languages: a SQL dialect (powered by Apache Calcite) and a
> JSON-over-HTTP API.
>
> Druid was originally developed to power a slice-and-dice
> analytical UI built on top of large event streams. The
> original use case for Druid targeted ingest rates of
> millions of records/sec, retention of over a year of data,
> and query latencies of sub-second to a few seconds. Many
> people can benefit from such capability, and many already
> have (see http://druid.io/druid-powered.html). In addition,
> new use cases have emerged since Druid's original
> development, such as OLAP acceleration of data warehouse
> tables and more highly concurrent applications operating
> with relatively narrower queries.
>
> == Background ==
>
> Druid is a data store designed for fast analytics. It would
> typically be used in lieu of more general purpose query
> systems like Hadoop MapReduce or Spark when query latency is
> of the utmost importance. Druid is often used as a data
> store for powering GUI analytical applications.
>
> The buzzwordy description of Druid is a high-performance,
> column-oriented, distributed data store. What we mean by
> this is:
>
> * "high performance": Druid aims to provide the lowest query
>   latency and highest ingest rates possible.
> * "column-oriented": Druid stores data in a column-oriented
>   format, like most other systems designed for analytics. It
>   can also store indexes along with the columns.
> * "distributed": Druid is deployed in clusters, typically of
>   tens to hundreds of nodes.
> * "data store": Druid loads your data and stores a copy of
>   it on the cluster's local disks (and may cache it in
>   memory). It doesn't query your data from some other
>   storage system.
>
> == Rationale ==
>
> Druid is a mature, active project with a large number of
> production installations, dozens of contributors to each
> release, and multiple vendors offering professional
> support. Given Druid's strong community, its close
> integration with many other Apache projects (such as Kafka,
> Hadoop, and Calcite), and its pre-existing Apache-inspired
> governance structure, we feel that Apache is the best home
> for the project on a long-term basis.
>
> == Current Status ==
>
> === Meritocracy ===
>
> Since Druid was first open sourced, the original developers
> have solicited contributions from others, including through
> our blog, the project mailing lists, and through accepting
> GitHub pull requests. We have an Apache-inspired governance
> structure with a PMC and committers, and our committer ranks
> include a good number of people from outside the original
> development team.
>
> === Community ===
>
> The Druid core developers have sought to nurture a community
> throughout the life of the project. We use GitHub as the
> focal point for bug reports and code contributions, and the
> mailing lists for most other discussion. To try to make
> people feel welcome, we've also spelled this out on a
> "CONTRIBUTE" link from the project page:
> http://druid.io/community/. Today we have an active
> contributor base (a typical release has ~40 contributors)
> and mailing list.
>
> === Core Developers ===
>
> Druid enjoys good diversity of committer affiliation. The
> most active developers over the past year are affiliated
> with four different companies: Imply, Metamarkets, Yahoo,
> and Hortonworks. Many Druid committers are also committers
> on other ASF projects, including Apache Airflow,
> Apache Curator, and Apache Calcite. The original developers
> of Druid remain involved in the project.
>
> === Alignment ===
>
> Druid's current governance structure is Apache-inspired with
> a PMC and committers chosen by a meritocratic
> process. Additionally, Druid integrates with a number of
> other Apache projects, including Kafka, Hadoop, Hive,
> Calcite, Superset (incubating), Spark, Curator, and
> ZooKeeper.
>
> == Known Risks ==
>
> === Orphaned products ===
>
> The risk of Druid becoming orphaned is low, due to a diverse
> committer base that is invested in the future of the
> project.
>
> === Inexperience with Open Source ===
>
> Druid's core developers have been running it as a
> community-oriented open source project for some time now,
> and many of them are committers on other open source
> projects as well, including Apache Airflow, Apache Curator,
> and Apache Calcite.
>
> === Homogeneous Developers ===
>
> Druid's current diversity of committer affiliation means
> that we have become accustomed to working collaboratively
> and in the open. We hope that a transition to the ASF helps
> Druid's contributor base become even more diverse.
>
> === Reliance on Salaried Developers ===
>
> Druid's user base and contributor base skew heavily towards
> salaried developers. We believe this is natural since Druid
> is a technology designed to be deployed on large clusters,
> and due to this, tends to be deployed by organizations
> rather than by individuals. Nevertheless, many current Druid
> developers have continued working on the project even
> through job changes, which we take to be a good sign of
> developer commitment and personal interest.
>
> === Relationships with Other Apache Products ===
>
> Druid integrates with a number of other Apache
> projects. Druid internally uses Calcite for SQL planning,
> and Curator and ZooKeeper for coordination. Druid can read
> data in Avro or Parquet format. Druid can load data from
> streams in Kafka or from files in Hadoop. Druid integrates
> with Hive as an option for SQL query acceleration. Druid
> data can be visualized by Superset (incubating).
>
> === An Excessive Fascination with the Apache Brand ===
>
> Druid is a successful project with a diverse community. The
> main reason for pursuing incubation is to find a stable,
> long term home for the project with a well known governance
> philosophy.
>
> == Required Resources ==
>
> === Mailing lists ===
>
> We would like to migrate the existing Druid mailing lists
> from Google Groups to Apache.
>
> * druid-user@googlegroups -> us...@druid.incubator.apache.org
> * druid-development@googlegroups -> d...@druid.incubator.apache.org
>
> === Source control ===
>
> Druid development currently takes place on GitHub. We would
> like to continue using GitHub, if possible, in order to
> preserve the workflows the community has developed around
> GitHub pull requests.
>
> === Issue tracking ===
>
> Druid currently uses GitHub issues for issue tracking. We
> would like to migrate to Apache JIRA at
> http://issues.apache.org/jira/browse/DRUID.
>
> == Documentation ==
>
> Druid's documentation can be found at
> http://druid.io/docs/latest/.
>
> == Initial Source ==
>
> Druid was initially open-sourced by Metamarkets in 2012 and
> has been run in a community-governed fashion since then. The
> code is currently hosted at https://github.com/druid-io/ and
> includes the following repositories:
>
> * druid (primary repository)
> * druid-console (web console for Druid)
> * druid-io.github.io (source for Druid's website at
>   http://druid.io/)
> * tranquility (realtime stream push client for Druid)
> * docker-druid (Docker image for Druid)
> * pydruid (Python library)
> * RDruid (R library)
> * oss-parent (Maven POM files)
>
> == Source and Intellectual Property Submission Plan ==
>
> A complete set of the open source code needs to be licensed
> from the owning organization to the Foundation. Commercial
> legal counsel for the owning organization will review the
> standard Foundation licensing paperwork and propose any
> updates as needed. This license will enable Apache to
> incubate and manage the Druid project moving forward.
>
> Other Druid paraphernalia to be transferred to Apache
> consists of:
>
> * GitHub organization at https://github.com/druid-io/
> * Twitter account at https://twitter.com/druidio
> * "druid.io" domain name
> * "Druid" trademark assignment per Foundation standard
>   paperwork. The trademark assignment paperwork shall be
>   reviewed by the owning organization's commercial and IP
>   counsel
> * CLAs - all rights in the code licensed above should
>   encompass the CLAs that existed between developers and
>   the owning organization
>
> A copyright license to the code, trademark assignment of
> Druid, and transfer of other paraphernalia to Apache should
> be sufficient to cover all rights required by Apache to
> operate the project.
>
> == External Dependencies ==
>
> External dependencies distributed with Druid currently all
> have one of the following Category A or B licenses: ASL,
> BSD, CDDL, EPL, MIT, MPL; with one exception: the optional
> Druid MySQL metadata store extension depends on MySQL
> Connector/J, which is GPL licensed. Druid currently packages
> this as a separate download; see our current presentation
> on: http://druid.io/downloads.html. As part of incubation we
> intend to determine the best strategy for handling the MySQL
> extension.
>
> == Cryptography ==
>
> Not applicable.
>
> == Initial Committers ==
>
> The initial committers for incubation are the current set of
> committers on Druid who have expressed interest in being
> involved in Apache incubation. Affiliations are listed
> where relevant. We may seek to add other committers during
> incubation; for example, we would want to add any current
> Druid committers who express an interest after incubation
> begins.
>
> * Charles Allen (char...@allen-net.com) (Snap)
> * David Lim (david.clarence....@gmail.com) (Imply)
> * Eric Tschetter (ched...@apache.org) (Splunk)
> * Fangjin Yang (f...@imply.io) (Imply)
> * Gian Merlino (g...@apache.org) (Imply)
> * Himanshu Gupta (g.himan...@gmail.com) (Oath)
> * Jihoon Son (jihoon...@apache.org) (Imply)
> * Jonathan Wei (jon....@imply.io) (Imply)
> * Maxime Beauchemin (maximebeauche...@gmail.com) (Lyft)
> * Mohamed Slim Bouguerra (slim.bougue...@gmail.com) (Hortonworks)
> * Nishant Bangarwa (nish...@apache.org) (Hortonworks)
> * Parag Jain (paragjai...@gmail.com) (Oath)
> * Roman Leventov (leventov...@gmail.com) (Metamarkets)
> * Xavier Léauté (xav...@leaute.com) (Confluent)
>
> == Sponsors ==
>
> * Champion: Julian Hyde
> * Nominated mentors: Julian Hyde, P. Taylor Goetz, Jun Rao
> * Sponsoring entity: Apache Incubator
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org