+1 (non-binding).

Thanks,
Henry

On 27 November 2015 at 07:14, Andrew Bayer <andrew.ba...@gmail.com> wrote:

> +1 binding
>
> On Thursday, November 26, 2015, Ted Dunning <ted.dunn...@gmail.com> wrote:
>
> > +1 (binding)
> >
> > I think that forcing experienced community developers into one model or
> > the other is unnecessary. Let them in as they would like.
> >
> >
> >
> > On Wed, Nov 25, 2015 at 4:51 PM, Greg Stein <gst...@gmail.com> wrote:
> >
> > > -1 (binding)
> > >
> > > Starting with RTC is a poor way to attract new community members. I'd
> > > like to see this community use CTR instead of mandating gerrit reviews.
> > >
> > > (ref: other-threads about lack of trust, and control issues; poor basis
> > > for a community)
> > >
> > > On Tue, Nov 24, 2015 at 1:32 PM, Todd Lipcon <t...@apache.org> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Discussion on the [DISCUSS] thread seems to have wound down, so I'd
> > like
> > > to
> > > > call a VOTE on acceptance of Kudu into the ASF Incubator. The
> proposal
> > is
> > > > pasted below and also available on the wiki at:
> > > > https://wiki.apache.org/incubator/KuduProposal
> > > >
> > > > The proposal is unchanged since the original version, except for the
> > > > addition of Carl Steinbach as a Mentor.
> > > >
> > > > Please cast your votes:
> > > >
> > > > [] +1, accept Kudu into the Incubator
> > > > [] +/-0, positive/negative non-counted expression of feelings
> > > > [] -1, do not accept Kudu into the incubator (please state reasoning)
> > > >
> > > > Given the US holiday this week, I imagine many folks are traveling or
> > > > otherwise offline. So, let's run the vote for a full week rather than
> > the
> > > > traditional 72 hours. Unless the IPMC objects to the extended voting
> > > > period, the vote will close on Tues, Dec 1st at noon PST.
> > > >
> > > > Thanks
> > > > -Todd
> > > > -----
> > > >
> > > > = Kudu Proposal =
> > > >
> > > > == Abstract ==
> > > >
> > > > Kudu is a distributed columnar storage engine built for the Apache
> > Hadoop
> > > > ecosystem.
> > > >
> > > > == Proposal ==
> > > >
> > > > Kudu is an open source storage engine for structured data which
> > supports
> > > > low-latency random access together with efficient analytical access
> > > > patterns. Kudu distributes data using horizontal partitioning and
> > > > replicates each partition using Raft consensus, providing low
> > > > mean-time-to-recovery and low tail latencies. Kudu is designed within
> > the
> > > > context of the Apache Hadoop ecosystem and supports many integrations
> > > with
> > > > other data analytics projects both inside and outside of the Apache
> > > > Software Foundation.
> > > >
> > > >
> > > >
> > > > We propose to incubate Kudu as a project of the Apache Software
> > > Foundation.
> > > >
> > > > == Background ==
> > > >
> > > > In recent years, explosive growth in the amount of data being
> generated
> > > and
> > > > captured by enterprises has resulted in the rapid adoption of open
> > source
> > > > technology which is able to store massive data sets at scale and at
> low
> > > > cost. In particular, the Apache Hadoop ecosystem has become a focal
> > point
> > > > for such “big data” workloads, because many traditional open source
> > > > database systems have lagged in offering a scalable alternative.
> > > >
> > > >
> > > >
> > > > Structured storage in the Hadoop ecosystem has typically been
> achieved
> > in
> > > > two ways: for static data sets, data is typically stored on Apache
> HDFS
> > > > using binary data formats such as Apache Avro or Apache Parquet.
> > However,
> > > > neither HDFS nor these formats has any provision for updating
> > individual
> > > > records, or for efficient random access. Mutable data sets are
> > typically
> > > > stored in semi-structured stores such as Apache HBase or Apache
> > > Cassandra.
> > > > These systems allow for low-latency record-level reads and writes,
> but
> > > lag
> > > > far behind the static file formats in terms of sequential read
> > throughput
> > > > for applications such as SQL-based analytics or machine learning.
> > > >
> > > >
> > > >
> > > > Kudu is a new storage system designed and implemented from the ground
> > up
> > > to
> > > > fill this gap between high-throughput sequential-access storage
> systems
> > > > such as HDFS and low-latency random-access systems such as HBase or
> > > > Cassandra. While these existing systems continue to hold advantages
> in
> > > some
> > > > situations, Kudu offers a “happy medium” alternative that can
> > > dramatically
> > > > simplify the architecture of many common workloads. In particular,
> Kudu
> > > > offers a simple API for row-level inserts, updates, and deletes,
> while
> > > > providing table scans at throughputs similar to Parquet, a
> > commonly-used
> > > > columnar format for static data.
> > > >
> > > >
> > > >
> > > > More information on Kudu can be found at the existing open source
> > project
> > > > website: http://getkudu.io and in particular in the Kudu white-paper
> > > PDF:
> > > > http://getkudu.io/kudu.pdf from which the above was excerpted.
> > > >
> > > > == Rationale ==
> > > >
> > > > As described above, Kudu fills an important gap in the open source
> > > storage
> > > > ecosystem. After our initial open source project release in September
> > > 2015,
> > > > we have seen a great amount of interest across a diverse set of users
> > and
> > > > companies. We believe that, as a storage system, it is critical to
> > build
> > > an
> > > > equally diverse set of contributors in the development community. Our
> > > > experiences as committers and PMC members on other Apache projects
> have
> > > > taught us the value of diverse communities in ensuring both longevity
> > and
> > > > high quality for such foundational systems.
> > > >
> > > > == Initial Goals ==
> > > >
> > > >  * Move the existing codebase, website, documentation, and mailing
> > lists
> > > to
> > > > Apache-hosted infrastructure
> > > >  * Work with the infrastructure team to implement and approve our
> code
> > > > review, build, and testing workflows in the context of the ASF
> > > >  * Incremental development and releases per Apache guidelines
> > > >
> > > > == Current Status ==
> > > >
> > > > ==== Releases ====
> > > >
> > > > Kudu has undergone one public release, tagged here
> > > > https://github.com/cloudera/kudu/tree/kudu0.5.0-release
> > > >
> > > > This initial release was not performed in the typical ASF fashion --
> no
> > > > source tarball was released, but rather only convenience binaries
> made
> > > > available in Cloudera’s repositories. We will adopt the ASF source
> > > release
> > > > process upon joining the incubator.
> > > >
> > > >
> > > > ==== Source ====
> > > >
> > > > Kudu’s source is currently hosted on GitHub at
> > > > https://github.com/cloudera/kudu
> > > >
> > > > This repository will be transitioned to Apache’s git hosting during
> > > > incubation.
> > > >
> > > >
> > > >
> > > > ==== Code review ====
> > > >
> > > > Kudu’s code reviews are currently public and hosted on Gerrit at
> > > > http://gerrit.cloudera.org:8080/#/q/status:open+project:kudu
> > > >
> > > > The Kudu developer community is very happy with gerrit and hopes to
> > work
> > > > with the Apache Infrastructure team to figure out how we can continue
> > to
> > > > use Gerrit within ASF policies.
> > > >
> > > >
> > > >
> > > > ==== Issue tracking ====
> > > >
> > > > Kudu’s bug and feature tracking is hosted on JIRA at:
> > > > https://issues.cloudera.org/projects/KUDU/summary
> > > >
> > > > This JIRA instance contains bugs and development discussion dating
> > back 2
> > > > years prior to Kudu’s open source release and will provide an initial
> > > seed
> > > > for the ASF JIRA.
> > > >
> > > >
> > > >
> > > > ==== Community discussion ====
> > > >
> > > > Kudu has several public discussion forums, linked here:
> > > > http://getkudu.io/community.html
> > > >
> > > >
> > > >
> > > > ==== Build Infrastructure ====
> > > >
> > > > The Kudu Gerrit instance is configured to only allow patches to be
> > > > committed after running them through an extensive set of pre-commit
> > tests
> > > > and code lints. The project currently makes use of elastic public
> cloud
> > > > resources to perform these tests. Until this point, these resources
> > have
> > > > been internal to Cloudera, though we are currently investing in
> moving
> > > to a
> > > > publicly accessible infrastructure.
> > > >
> > > >
> > > >
> > > > ==== Development practices ====
> > > >
> > > > Given that Kudu is a persistent storage engine, the community has a
> > high
> > > > quality bar for contributions to its core. We have a firm belief that
> > > high
> > > > quality is achieved through automation, not manual inspection, and
> > hence
> > > > put a focus on thorough testing and build infrastructure to ensure
> that
> > > > bar. The development community also practices review-then-commit for
> > all
> > > > changes to ensure that changes are accompanied by appropriate tests,
> > are
> > > > well commented, etc.
> > > >
> > > > Rather than seeing these practices as barriers to contribution, we
> > > believe
> > > > that a fully automated and standardized review and testing practice
> > makes
> > > > it easier for new contributors to have patches accepted. Any new
> > > developer
> > > > may post a patch to Gerrit using the same workflow as a seasoned
> > > > contributor, and the same suite of tests will be automatically run.
> If
> > > the
> > > > tests pass, a committer can quickly review and commit the
> contribution
> > > from
> > > > their web browser.
> > > >
> > > > === Meritocracy ===
> > > >
> > > > We believe strongly in meritocracy in electing committers and PMC
> > > members.
> > > > We believe that contributions can come in forms other than just code:
> > for
> > > > example, one of our initial proposed committers has contributed
> solely
> > in
> > > > the area of project documentation. We will encourage contributions
> and
> > > > participation of all types, and ensure that contributors are
> > > appropriately
> > > > recognized.
> > > >
> > > > === Community ===
> > > >
> > > > Though Kudu is relatively new as an open source project, it has
> already
> > > > seen promising growth in its community across several organizations:
> > > >
> > > >  * '''Cloudera''' is the original development sponsor for Kudu.
> > > >  * '''Xiaomi''' has been helping to develop and optimize Kudu for a
> new
> > > > production use case, contributing code, benchmarks, feedback, and
> > > > conference talks.
> > > >  * '''Intel''' has contributed optimizations related to their
> hardware
> > > > technologies.
> > > >  * '''Dropbox''' has been experimenting with Kudu for a machine
> > > monitoring
> > > > use case, and has been contributing bug reports and product feedback.
> > > >  * '''Dremio''' is working on integration with Apache Drill and
> > exploring
> > > > using Kudu in a production use case.
> > > >  * Several community-built Docker images, tutorials, and blog posts
> > have
> > > > sprouted up since Kudu’s release.
> > > >
> > > >
> > > >
> > > > By bringing Kudu to Apache, we hope to encourage further contribution
> > > from
> > > > the above organizations as well as to engage new users and
> contributors
> > > in
> > > > the community.
> > > >
> > > > === Core Developers ===
> > > >
> > > > Kudu was initially developed as a project at Cloudera. Most of the
> > > > contributions to date have been by developers employed by Cloudera.
> > > >
> > > >
> > > >
> > > > Many of the developers are committers or PMC members on other Apache
> > > > projects.
> > > >
> > > > === Alignment ===
> > > >
> > > > As a project in the big data ecosystem, Kudu is aligned with several
> > > other
> > > > ASF projects. Kudu includes input/output format integration with
> Apache
> > > > Hadoop, and this integration can also provide a bridge to Apache
> Spark.
> > > We
> > > > are planning to integrate with Apache Hive in the near future. We
> also
> > > > integrate closely with Cloudera Impala, which is also currently being
> > > > proposed for incubation. We have also scheduled a hackathon with the
> > > Apache
> > > > Drill team to work on integration with that query engine.
> > > >
> > > > == Known Risks ==
> > > >
> > > > === Orphaned Products ===
> > > >
> > > > The risk of Kudu being abandoned is low. Cloudera has invested a
> great
> > > deal
> > > > in the initial development of the project, and intends to grow its
> > > > investment over time as Kudu becomes a product adopted by its
> customer
> > > > base. Several other organizations are also experimenting with Kudu
> for
> > > > production use cases which would live for many years.
> > > >
> > > > === Inexperience with Open Source ===
> > > >
> > > > Kudu has been released in the open for less than two months. However,
> > > from
> > > > our very first public announcement we have been committed to
> > open-source
> > > > style development:
> > > >
> > > >  * our code reviews are fully public and documented on a mailing list
> > > >  * our daily development chatter is in a public chat room
> > > >  * we send out weekly “community status” reports highlighting news
> and
> > > > contributions
> > > >  * we published our entire JIRA history and discuss bugs in the open
> > > >  * we published our entire Git commit history, going back three years
> > (no
> > > > squashing)
> > > >
> > > >
> > > >
> > > > Several of the initial committers are experienced open source
> > developers,
> > > > several being committers and/or PMC members on other ASF projects
> > > (Hadoop,
> > > > HBase, Thrift, Flume, et al). Those who are not ASF committers have
> > > > experience on non-ASF open source projects (Kiji, open-vm-tools, et
> > al).
> > > >
> > > > === Homogenous Developers ===
> > > >
> > > > The initial committers are employees or former employees of Cloudera.
> > > > However, the committers are spread across multiple offices (Palo
> Alto,
> > > San
> > > > Francisco, Melbourne), so the team is familiar with working in a
> > > > distributed environment across varied time zones.
> > > >
> > > >
> > > >
> > > > The project has received some contributions from developers outside
> of
> > > > Cloudera, and is starting to attract a ''user'' community as well. We
> > > hope
> > > > to continue to encourage contributions from these developers and
> > > community
> > > > members and grow them into committers after they have had time to
> > > continue
> > > > their contributions.
> > > >
> > > > === Reliance on Salaried Developers ===
> > > >
> > > > As mentioned above, the majority of development up to this point has
> > been
> > > > sponsored by Cloudera. We have seen several community users
> participate
> > > in
> > > > discussions who are hobbyists interested in distributed systems and
> > > > databases, and hope that they will continue their participation in
> the
> > > > project going forward.
> > > >
> > > > === Relationships with Other Apache Products ===
> > > >
> > > > Kudu is currently related to the following other Apache projects:
> > > >
> > > >  * Hadoop: Kudu provides MapReduce input/output formats for
> integration
> > > >  * Spark: Kudu integrates with Spark via the above-mentioned input
> > > formats,
> > > > and work is progressing on support for Spark Data Frames and Spark
> SQL.
> > > >
> > > >
> > > >
> > > > The Kudu team has reached out to several other Apache projects to
> start
> > > > discussing integrations, including Flume, Kafka, Hive, and Drill.
> > > >
> > > >
> > > >
> > > > Kudu integrates with Impala, which is also being proposed for
> > incubation.
> > > >
> > > >
> > > >
> > > > Kudu is already collaborating on ValueVector, a proposed TLP spinning
> > out
> > > > from the Apache Drill community.
> > > >
> > > >
> > > >
> > > > We look forward to continuing to integrate and collaborate with these
> > > > communities.
> > > >
> > > > === An Excessive Fascination with the Apache Brand ===
> > > >
> > > > Many of the initial committers are already experienced Apache
> > committers,
> > > > and understand the true value provided by the Apache Way and the
> > > principles
> > > > of the ASF. We believe that this development and contribution model
> is
> > > > especially appropriate for storage products, where Apache’s
> > > > community-over-code philosophy ensures long term viability and
> > > > consensus-based participation.
> > > >
> > > > == Documentation ==
> > > >
> > > >  * Documentation is written in AsciiDoc and committed in the Kudu
> > source
> > > > repository:
> > > >
> > > >  * https://github.com/cloudera/kudu/tree/master/docs
> > > >
> > > >
> > > >
> > > >  * The Kudu web site is version-controlled on the ‘gh-pages’ branch
> of
> > > the
> > > > above repository.
> > > >
> > > >  * A LaTeX whitepaper is also published, and the source is available
> > > within
> > > > the same repository.
> > > >  * APIs are documented within the source code as JavaDoc or C++-style
> > > > documentation comments.
> > > >  * Many design documents are stored within the source code repository
> > as
> > > > text files next to the code being documented.
> > > >
> > > > == Source and Intellectual Property Submission Plan ==
> > > >
> > > > The Kudu codebase and web site are currently hosted on GitHub and will
> > > > be transitioned to the ASF repositories during incubation. Kudu is
> > > > already licensed under the Apache 2.0 license.
> > > >
> > > >
> > > >
> > > > Some portions of the code are imported from other open source
> projects
> > > > under the Apache 2.0, BSD, or MIT licenses, with copyrights held by
> > > authors
> > > > other than the initial committers. These copyright notices are
> > maintained
> > > > in those files as well as a top-level NOTICE.txt file. We believe
> this
> > to
> > > > be permissible under the license terms and ASF policies, and
> confirmed
> > > via
> > > > a recent thread on general@incubator.apache.org.
> > > >
> > > >
> > > >
> > > > The “Kudu” name is not a registered trademark, though before the
> > initial
> > > > release of the project, we performed a trademark search and
> Cloudera’s
> > > > legal counsel deemed it acceptable in the context of a data storage
> > > engine.
> > > > There exists an unrelated open source project by the same name
> related
> > to
> > > > deployments on Microsoft’s Azure cloud service. We have been in
> contact
> > > > with legal counsel from Microsoft and have obtained their approval
> for
> > > the
> > > > use of the Kudu name.
> > > >
> > > >
> > > >
> > > > Cloudera currently owns several domain names related to Kudu
> > > > (getkudu.io, kududb.io, et al) which will be transferred to the ASF
> > > > and redirected to the official page during incubation.
> > > >
> > > >
> > > >
> > > > Portions of Kudu are protected by pending or published patents owned
> by
> > > > Cloudera. Given the protections already granted by the Apache
> License,
> > we
> > > > do not anticipate any explicit licensing or transfer of this
> > intellectual
> > > > property.
> > > >
> > > > == External Dependencies ==
> > > >
> > > > The full set of dependencies and licenses are listed in
> > > > https://github.com/cloudera/kudu/blob/master/LICENSE.txt
> > > >
> > > > and summarized here:
> > > >
> > > >  * '''Twitter Bootstrap''': Apache 2.0
> > > >  * '''d3''': BSD 3-clause
> > > >  * '''epoch JS library''': MIT
> > > >  * '''lz4''': BSD 2-clause
> > > >  * '''gflags''': BSD 3-clause
> > > >  * '''glog''': BSD 3-clause
> > > >  * '''gperftools''': BSD 3-clause
> > > >  * '''libev''': BSD 2-clause
> > > >  * '''squeasel''': MIT
> > > >  * '''protobuf''': BSD 3-clause
> > > >  * '''rapidjson''': MIT
> > > >  * '''snappy''': BSD 3-clause
> > > >  * '''trace-viewer''': BSD 3-clause
> > > >  * '''zlib''': zlib license
> > > >  * '''llvm''': University of Illinois/NCSA Open Source (BSD-alike)
> > > >  * '''bitshuffle''': MIT
> > > >  * '''boost''': Boost license
> > > >  * '''curl''': MIT
> > > >  * '''libunwind''': MIT
> > > >  * '''nvml''': BSD 3-clause
> > > >  * '''cyrus-sasl''': Cyrus SASL license (BSD-alike)
> > > >  * '''openssl''': OpenSSL License (BSD-alike)
> > > >
> > > >  * '''Guava''': Apache 2.0
> > > >  * '''StumbleUpon Async''': BSD
> > > >  * '''Apache Hadoop''': Apache 2.0
> > > >  * '''Apache log4j''': Apache 2.0
> > > >  * '''Netty''': Apache 2.0
> > > >  * '''slf4j''': MIT
> > > >  * '''Apache Commons''': Apache 2.0
> > > >  * '''murmur''': Apache 2.0
> > > >
> > > >
> > > > '''Build/test-only dependencies''':
> > > >
> > > >  * '''CMake''': BSD 3-clause
> > > >  * '''gcovr''': BSD 3-clause
> > > >  * '''gmock''': BSD 3-clause
> > > >  * '''Apache Maven''': Apache 2.0
> > > >  * '''JUnit''': EPL
> > > >  * '''Mockito''': MIT
> > > >
> > > > == Cryptography ==
> > > >
> > > > Kudu does not currently include any cryptography-related code.
> > > >
> > > > == Required Resources ==
> > > >
> > > > === Mailing lists ===
> > > >
> > > >  * priv...@kudu.incubator.apache.org (PMC)
> > > >  * comm...@kudu.incubator.apache.org (git push emails)
> > > >  * iss...@kudu.incubator.apache.org (JIRA issue feed)
> > > >  * d...@kudu.incubator.apache.org (Gerrit code reviews plus dev
> > > > discussion)
> > > >  * u...@kudu.incubator.apache.org (User questions)
> > > >
> > > >
> > > > === Repository ===
> > > >
> > > >  * git://git.apache.org/kudu
> > > >
> > > > === Gerrit ===
> > > >
> > > > We hope to continue using Gerrit for our code review and commit
> > workflow.
> > > > The Kudu team has already been in contact with Jake Farrell to start
> > > > discussions on how Gerrit can fit into the ASF. We know that several
> > > other
> > > > ASF projects and podlings are also interested in Gerrit.
> > > >
> > > >
> > > >
> > > > If the Infrastructure team does not have the bandwidth to support
> > Gerrit,
> > > > we will continue to support our own instance of Gerrit for Kudu, and
> > make
> > > > the necessary integrations such that commits are properly
> authenticated
> > > and
> > > > maintain sufficient provenance to uphold the ASF standards (e.g. via
> > the
> > > > solution adopted by the AsterixDB podling).
> > > >
> > > > == Issue Tracking ==
> > > >
> > > > We would like to import our current JIRA project into the ASF JIRA,
> > such
> > > > that our historical commit messages and code comments continue to
> > > reference
> > > > the appropriate bug numbers.
> > > >
> > > > == Initial Committers ==
> > > >
> > > >  * Adar Dembo a...@cloudera.com
> > > >  * Alex Feinberg a...@strlen.net
> > > >  * Andrew Wang w...@apache.org
> > > >  * Dan Burkert d...@cloudera.com
> > > >  * David Alves dral...@apache.org
> > > >  * Jean-Daniel Cryans jdcry...@apache.org
> > > >  * Mike Percy mpe...@apache.org
> > > >  * Misty Stanley-Jones mi...@apache.org
> > > >  * Todd Lipcon t...@apache.org
> > > >
> > > > The initial list of committers was seeded by listing those
> contributors
> > > who
> > > > have contributed 20 or more patches in the last 12 months, indicating
> > > that
> > > > they are active and have achieved merit through participation on the
> > > > project. We chose not to include other contributors who either have
> not
> > > yet
> > > > contributed a significant number of patches, or whose contributions
> are
> > > far
> > > > in the past and we don’t expect to be active within the ASF.
> > > >
> > > > == Affiliations ==
> > > >
> > > >  * Adar Dembo - Cloudera
> > > >  * Alex Feinberg - Forward Networks
> > > >  * Andrew Wang - Cloudera
> > > >  * Dan Burkert - Cloudera
> > > >  * David Alves - Cloudera
> > > >  * Jean-Daniel Cryans - Cloudera
> > > >  * Mike Percy - Cloudera
> > > >  * Misty Stanley-Jones - Cloudera
> > > >  * Todd Lipcon - Cloudera
> > > >
> > > > == Sponsors ==
> > > >
> > > > === Champion ===
> > > >
> > > >  * Todd Lipcon
> > > >
> > > > === Nominated Mentors ===
> > > >
> > > >  * Jake Farrell - ASF Member and Infra team member, Acquia
> > > >  * Brock Noland - ASF Member, StreamSets
> > > >  * Michael Stack - ASF Member, Cloudera
> > > >  * Jarek Jarcec Cecho - ASF Member, Cloudera
> > > >  * Chris Mattmann - ASF Member, NASA JPL and USC
> > > >  * Julien Le Dem - Incubator PMC, Dremio
> > > >  * Carl Steinbach - ASF Member, LinkedIn
> > > >
> > > > === Sponsoring Entity ===
> > > >
> > > > The Apache Incubator
> > > >
> > >
> >
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679
