Re: [RESULT] [VOTE] Accept Crail into the Apache Incubator

Jitendra Pandey Fri, 03 Nov 2017 15:04:17 -0700

+1

On 11/1/17, 7:40 AM, "Craig Russell" <apache....@gmail.com> wrote:


    Subject line change to close the vote.
    
    > On Nov 1, 2017, at 6:42 AM, Luciano Resende <luckbr1...@gmail.com> wrote:
    > 
    > On Thu, Oct 26, 2017 at 8:31 AM, Luciano Resende <luckbr1...@gmail.com>
    > wrote:
    > 
    >> Now that the discussion thread on the Crail proposal has ended, please
    >> vote on accepting Crail into the Apache Incubator.
    >> 
    >> The ASF voting rules are described at:
    >>   http://www.apache.org/foundation/voting.html
    >> 
    >> A vote for accepting a new Apache Incubator podling is a majority vote
    >> for which only Incubator PMC member votes are binding.
    >> 
    >> Votes from other people are also welcome as an indication of people's
    >> enthusiasm (or lack thereof).
    >> 
    >> Please do not use this VOTE thread for discussions.
    >> If needed, start a new thread instead.
    >> 
    >> This vote will run for at least 72 hours. Please VOTE as follows:
    >> [] +1 Accept Crail into the Apache Incubator
    >> [] +0 Abstain.
    >> [] -1 Do not accept Crail into the Apache Incubator because ...
    >> 
    >> The proposal below is also on the wiki:
    >> https://wiki.apache.org/incubator/CrailProposal
    >> 
    >> ===
    >> 
    >> Abstract
    >> 
    >> Crail is a storage platform for sharing performance critical data in
    >> distributed data processing jobs at very high speed. Crail is built
    >> entirely upon principles of user-level I/O and specifically targets data
    >> center deployments with fast network and storage hardware (e.g., 100Gbps
    >> RDMA, plenty of DRAM, NVMe flash, etc.) as well as new modes of operation
    >> such as resource disaggregation or serverless computing. Crail is written in
    >> Java and integrates seamlessly with the Apache data processing ecosystem.
    >> It can be used as a backbone to accelerate high-level data operations such
    >> as shuffle or broadcast, or as a cache to store hot data that is queried
    >> repeatedly, or as a storage platform for sharing inter-job data in complex
    >> multi-job pipelines, etc.
    >> 
    >> Proposal
    >> 
    >> Crail enables Apache data processing frameworks to run efficiently in next
    >> generation data centers using fast storage and network hardware in
    >> combination with resource (e.g., DRAM, Flash) disaggregation.
    >> 
    >> Background
    >> 
    >> Crail started as a research project at the IBM Zurich Research Laboratory
    >> around 2014 aiming to integrate high-speed I/O hardware effectively into
    >> large scale data processing systems.
    >> 
    >> Rationale
    >> 
    >> During the last decade, I/O hardware has undergone rapid performance
    >> improvements, typically by orders of magnitude. Modern networking and
    >> storage hardware can deliver 100+ Gbps (10+ GBps) bandwidth with access
    >> latencies of a few microseconds. However, despite such progress in raw
    >> I/O performance, effectively leveraging modern hardware in data
    >> processing frameworks remains challenging. In most cases, upgrading to
    >> high-end networking or storage hardware has very little effect on the
    >> performance of analytics workloads. The problem comes from heavily
    >> layered software imposing overheads such as deep call stacks, unnecessary
    >> data copies, thread contention, etc. These problems have already been
    >> addressed at the operating system level with new I/O APIs such as RDMA
    >> verbs, NVMe, etc., allowing applications to bypass software layers during
    >> I/O operations. Distributed data processing frameworks, on the other
    >> hand, are typically implemented on legacy I/O interfaces such as sockets
    >> or block storage. These interfaces have been shown to be insufficient to
    >> deliver the full hardware performance. Yet, to the best of our knowledge,
    >> there are no active and systematic efforts to integrate these new
    >> user-level I/O APIs into Apache software frameworks. This problem affects
    >> all end users and organizations that use Apache software. We expect them
    >> to see unsatisfactorily small performance gains when upgrading their
    >> networking and storage hardware.
    >> 
    >> Crail solves this problem by providing an efficient storage platform
    >> built upon user-level I/O, thus bypassing layers such as the JVM and OS
    >> during I/O operations. Moreover, Crail directly leverages the specific
    >> hardware features of RDMA and NVMe to provide a better integration with
    >> high-level data operations in Apache compute frameworks. As a
    >> consequence, Crail enables users to run larger, more complex queries
    >> against ever increasing amounts of data at a speed largely determined by
    >> the deployed hardware. Crail is a generic solution that integrates well
    >> with the Apache ecosystem, including frameworks like Spark, Hadoop, Hive,
    >> etc.
    >> 
    >> Initial Goals
    >> 
    >> The initial goals of moving Crail to the Apache Incubator are to broaden
    >> the community and to foster contributions from developers to leverage
    >> Crail in various data processing frameworks and workloads. Ultimately,
    >> the goal for Crail is to become the de facto standard platform for
    >> storing temporary performance-critical data in distributed data
    >> processing systems.
    >> 
    >> Current Status
    >> 
    >> The initial code has been developed at the IBM Zurich Research Center and
    >> has recently been made available on GitHub under the Apache Software
    >> License 2.0. The project currently has explicit support for Spark and
    >> Hadoop. Project documentation is available on the website www.crail.io.
    >> There is also a public forum for discussions related to Crail available at
    >> https://groups.google.com/forum/#!forum/zrlio-users.
    >> 
    >> Meritocracy
    >> 
    >> The current developers are familiar with the meritocratic open source
    >> development process at Apache. Over the last year, the project has
    >> gathered interest on GitHub, and several companies have already expressed
    >> interest in the project. We plan to invest in supporting a meritocracy by
    >> inviting additional developers to participate.
    >> 
    >> Community
    >> 
    >> The need for a generic solution to integrate high-performance I/O hardware
    >> in open source is tremendous, so there is potential for a very large
    >> community. We believe that Crail’s extensible architecture and its
    >> alignment with the Apache ecosystem will further encourage community
    >> participation. We expect that over time Crail will attract a large
    >> community.
    >> 
    >> Alignment
    >> 
    >> Crail is written in Java and is built for the Apache data processing
    >> ecosystem. The basic storage services of Crail can be used seamlessly from
    >> Spark, Hadoop, and Storm. The enhanced storage services require dedicated
    >> bindings specific to each data processing framework, which are currently
    >> available only for Spark. We think that moving Crail to the Apache
    >> Incubator will help to extend Crail’s support for different data
    >> processing frameworks.
    >> 
    >> Known Risks
    >> 
    >> To date, development has been sponsored by IBM and coordinated mostly by
    >> the core team of researchers at the IBM Zurich Research Center. For Crail
    >> to fully transition to an "Apache Way" governance model, it needs to start
    >> embracing the meritocracy-centric way of growing the community of
    >> contributors.
    >> 
    >> Orphaned Products
    >> 
    >> The Crail developers have a long-term interest in use and maintenance of
    >> the code and there is also hope that growing a diverse community around
    >> the project will become a guarantee against the project becoming orphaned.
    >> We feel that it is also important to put formal governance in place both
    >> for the project and the contributors as the project expands. We feel ASF
    >> is the best location for this.
    >> 
    >> Inexperience with Open Source
    >> 
    >> Several of the initial committers are experienced open source developers
    >> (Linux Kernel, DPDK, etc.).
    >> 
    >> Relationships with Other Apache Products
    >> 
    >> As of now, Crail has been tested with Spark, Hadoop and Hive, but it is
    >> designed to integrate with any of the Apache data processing frameworks.
    >> 
    >> Homogeneous Developers
    >> 
    >> The project already has a diverse developer base including contributions
    >> from organizations and public developers.
    >> 
    >> An Excessive Fascination with the Apache Brand
    >> 
    >> Crail solves a real need for a generic approach to leverage modern network
    >> and storage hardware effectively in the Apache Hadoop and Spark
    >> ecosystems. Our rationale for developing Crail as an Apache project is
    >> detailed in the Rationale section. We believe that the Apache brand and
    >> community process will help us to engage a larger community and
    >> facilitate closer ties with various Apache data processing projects.
    >> 
    >> Documentation
    >> 
    >> Documentation regarding Crail is available at www.crail.io
    >> 
    >> Initial Source
    >> 
    >> Initial source is available on GitHub under the Apache License 2.0:
    >> 
    >> https://github.com/zrlio/crail
    >> 
    >> External Dependencies
    >> 
    >> Crail is written in Java and currently supports Apache Hadoop MapReduce
    >> and Apache Spark runtimes. To the best of our knowledge, all dependencies
    >> of Crail are distributed under Apache compatible licenses.
    >> 
    >> Required Resources
    >> 
    >> Mailing lists
    >> 
    >> priv...@crail.incubator.apache.org
    >> d...@crail.incubator.apache.org
    >> comm...@crail.incubator.apache.org
    >> 
    >> Git repository
    >> 
    >> https://git-wip-us.apache.org/repos/asf/incubator-crail.git
    >> 
    >> Issue Tracking
    >> 
    >> JIRA (Crail)
    >> 
    >> Initial Committers
    >> 
    >> Patrick Stuedi <stu AT ibm DOT zurich DOT com>
    >> Animesh Trivedi <atr AT ibm DOT zurich DOT com>
    >> Jonas Pfefferle <jpf AT ibm DOT zurich DOT com>
    >> Bernard Metzler <bmt AT ibm DOT zurich DOT com>
    >> Michael Kaufmann <kau AT ibm DOT zurich DOT com>
    >> Adrian Schuepbach <dri AT ibm DOT zurich DOT com>
    >> Patrick McArthur <patrick AT patrickmcarthur DOT net>
    >> Ana Klimovic <anakli AT stanford DOT edu>
    >> Yuval Degani <yuvaldeg AT mellanox DOT com>
    >> Vu Pham <vuhuong AT mellanox DOT com>
    >> 
    >> Affiliations
    >> 
    >> IBM (Patrick Stuedi, Animesh Trivedi, Jonas Pfefferle, Bernard Metzler,
    >> Michael Kaufmann, Adrian Schuepbach)
    >> University of New Hampshire (Patrick McArthur)
    >> Stanford University (Ana Klimovic)
    >> Mellanox (Yuval Degani, Vu Pham)
    >> 
    >> Sponsors
    >> 
    >> Champion
    >> 
    >> Luciano Resende <lresende AT apache DOT org>
    >> 
    >> Nominated Mentors
    >> 
    >> Luciano Resende <lresende AT apache DOT org>
    >> 
    >> Raphael Bircher <rbircher AT apache DOT org>
    >> 
    >> Julian Hyde <jhyde AT apache DOT org>
    >> 
    >> Sponsoring Entity
    >> 
    >> We would like to propose the Apache Incubator to sponsor this project.
    >> 
    >> 
    >> 
    > 
    > The vote has passed with 5 binding +1 votes from:
    > 
    > Luciano Resende
    > Julian Hyde
    > Raphael Bircher
    > Willem Jiang
    > Dave Fisher
    > 
    > And 5 non-binding +1 votes from:
    > 
    > Clebert Suconic
    > Gang(Gary) Wang
    > Debo Dutta (dedutta)
    > Kacie Karo
    > Pierre Smits
    > 
    > Thanks and Welcome to the Apache Incubator.
    > 
    > -- 
    > Luciano Resende
    > http://twitter.com/lresende1975
    > http://lresende.blogspot.com/
    
    Craig L Russell
    Secretary, Apache Software Foundation
    c...@apache.org http://db.apache.org/jdo
    
    
    ---------------------------------------------------------------------
    To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
    For additional commands, e-mail: general-h...@incubator.apache.org
