+1 (binding)
Willem Jiang

Blog: http://willemjiang.blogspot.com (English)
      http://jnn.iteye.com (Chinese)
Twitter: willemjiang
Weibo: 姜宁willem

On Sat, Oct 28, 2017 at 2:12 AM, Pierre Smits <pierre.sm...@gmail.com> wrote:

> +1
>
> Best regards
>
> Pierre
>
> On Fri, 27 Oct 2017 at 13:57 Raphael Bircher <rbircherapa...@gmail.com>
> wrote:
>
> > +1 (binding)
> >
> > On .10.2017, at 18:01, Luciano Resende <luckbr1...@gmail.com> wrote:
> >
> > > Of course, my +1
> > >
> > > On Thu, Oct 26, 2017 at 12:31 PM, Luciano Resende
> > > <luckbr1...@gmail.com> wrote:
> > >
> > >> Now that the discussion thread on the Crail proposal has ended,
> > >> please vote on accepting Crail into the Apache Incubator.
> > >>
> > >> The ASF voting rules are described at:
> > >> http://www.apache.org/foundation/voting.html
> > >>
> > >> A vote for accepting a new Apache Incubator podling is a majority
> > >> vote for which only Incubator PMC member votes are binding.
> > >>
> > >> Votes from other people are also welcome as an indication of
> > >> people's enthusiasm (or lack thereof).
> > >>
> > >> Please do not use this VOTE thread for discussions.
> > >> If needed, start a new thread instead.
> > >>
> > >> This vote will run for at least 72 hours. Please VOTE as follows:
> > >> [] +1 Accept Crail into the Apache Incubator
> > >> [] +0 Abstain.
> > >> [] -1 Do not accept Crail into the Apache Incubator because ...
> > >>
> > >> The proposal below is also on the wiki:
> > >> https://wiki.apache.org/incubator/CrailProposal
> > >>
> > >> ===
> > >>
> > >> Abstract
> > >>
> > >> Crail is a storage platform for sharing performance critical data
> > >> in distributed data processing jobs at very high speed. Crail is
> > >> built entirely upon principles of user-level I/O and specifically
> > >> targets data center deployments with fast network and storage
> > >> hardware (e.g., 100 Gbps RDMA, plenty of DRAM, NVMe flash, etc.)
> > >> as well as new modes of operation such as resource disaggregation
> > >> or serverless computing. Crail is written in Java and integrates
> > >> seamlessly with the Apache data processing ecosystem. It can be
> > >> used as a backbone to accelerate high-level data operations such
> > >> as shuffle or broadcast, or as a cache to store hot data that is
> > >> queried repeatedly, or as a storage platform for sharing inter-job
> > >> data in complex multi-job pipelines, etc.
> > >>
> > >> Proposal
> > >>
> > >> Crail enables Apache data processing frameworks to run efficiently
> > >> in next generation data centers using fast storage and network
> > >> hardware in combination with resource (e.g., DRAM, Flash)
> > >> disaggregation.
> > >>
> > >> Background
> > >>
> > >> Crail started as a research project at the IBM Zurich Research
> > >> Laboratory around 2014 aiming to integrate high-speed I/O hardware
> > >> effectively into large scale data processing systems.
> > >>
> > >> Rationale
> > >>
> > >> During the last decade, I/O hardware has undergone rapid
> > >> performance improvements, typically by orders of magnitude. Modern
> > >> day networking and storage hardware can deliver 100+ Gbps
> > >> (10+ GBps) bandwidth with access latencies of a few microseconds.
> > >> However, despite such progress in raw I/O performance, effectively
> > >> leveraging modern hardware in data processing frameworks remains
> > >> challenging. In most cases, upgrading to high-end networking or
> > >> storage hardware has very little effect on the performance of
> > >> analytics workloads. The problem comes from heavily layered
> > >> software imposing overheads such as deep call stacks, unnecessary
> > >> data copies, thread contention, etc.
> > >> These problems have already been addressed at the operating system
> > >> level with new I/O APIs such as RDMA verbs, NVMe, etc., allowing
> > >> applications to bypass software layers during I/O operations.
> > >> Distributed data processing frameworks, on the other hand, are
> > >> typically implemented on legacy I/O interfaces such as sockets or
> > >> block storage. These interfaces have been shown to be insufficient
> > >> to deliver the full hardware performance. Yet, to the best of our
> > >> knowledge, there are no active and systematic efforts to integrate
> > >> these new user-level I/O APIs into Apache software frameworks.
> > >> This problem affects all end-users and organizations that use
> > >> Apache software. We expect them to see unsatisfactorily small
> > >> performance gains when upgrading their networking and storage
> > >> hardware.
> > >>
> > >> Crail solves this problem by providing an efficient storage
> > >> platform built upon user-level I/O, thus bypassing layers such as
> > >> JVM and OS during I/O operations. Moreover, Crail directly
> > >> leverages the specific hardware features of RDMA and NVMe to
> > >> provide a better integration with high-level data operations in
> > >> Apache compute frameworks. As a consequence, Crail enables users
> > >> to run larger, more complex queries against ever increasing
> > >> amounts of data at a speed largely determined by the deployed
> > >> hardware. Crail is a generic solution that integrates well with
> > >> the Apache ecosystem including frameworks like Spark, Hadoop,
> > >> Hive, etc.
> > >>
> > >> Initial Goals
> > >>
> > >> The initial goal of moving Crail to the Apache Incubator is to
> > >> broaden the community, and foster contributions from developers to
> > >> leverage Crail in various data processing frameworks and
> > >> workloads.
> > >> Ultimately, the goal for Crail is to become the de facto standard
> > >> platform for storing temporary performance critical data in
> > >> distributed data processing systems.
> > >>
> > >> Current Status
> > >>
> > >> The initial code has been developed at the IBM Zurich Research
> > >> Center and has recently been made available on GitHub under the
> > >> Apache Software License 2.0. The project currently has explicit
> > >> support for Spark and Hadoop. Project documentation is available
> > >> on the website www.crail.io. There is also a public forum for
> > >> discussions related to Crail available at
> > >> https://groups.google.com/forum/#!forum/zrlio-users.
> > >>
> > >> Meritocracy
> > >>
> > >> The current developers are familiar with the meritocratic open
> > >> source development process at Apache. Over the last year, the
> > >> project has gathered interest at GitHub and several companies have
> > >> already expressed interest in the project. We plan to invest in
> > >> supporting a meritocracy by inviting additional developers to
> > >> participate.
> > >>
> > >> Community
> > >>
> > >> The need for a generic solution to integrate high-performance I/O
> > >> hardware in open source is tremendous, so there is a potential for
> > >> a very large community. We believe that Crail’s extensible
> > >> architecture and its alignment with the Apache Ecosystem will
> > >> further encourage community participation. We expect that over
> > >> time Crail will attract a large community.
> > >>
> > >> Alignment
> > >>
> > >> Crail is written in Java and is built for the Apache data
> > >> processing ecosystem. The basic storage services of Crail can be
> > >> used seamlessly from Spark, Hadoop, and Storm. The enhanced
> > >> storage services require dedicated bindings specific to each data
> > >> processing framework, which are currently available only for
> > >> Spark.
> > >> We think that moving Crail to the Apache Incubator will help to
> > >> extend Crail’s support for different data processing frameworks.
> > >>
> > >> Known Risks
> > >>
> > >> To date, development has been sponsored by IBM and coordinated
> > >> mostly by the core team of researchers at the IBM Zurich Research
> > >> Center. For Crail to fully transition to an "Apache Way"
> > >> governance model, it needs to start embracing the
> > >> meritocracy-centric way of growing the community of contributors.
> > >>
> > >> Orphaned Products
> > >>
> > >> The Crail developers have a long-term interest in use and
> > >> maintenance of the code and there is also hope that growing a
> > >> diverse community around the project will become a guarantee
> > >> against the project becoming orphaned. We feel that it is also
> > >> important to put formal governance in place both for the project
> > >> and the contributors as the project expands. We feel ASF is the
> > >> best location for this.
> > >>
> > >> Inexperience with Open Source
> > >>
> > >> Several of the initial committers are experienced open source
> > >> developers (Linux Kernel, DPDK, etc.).
> > >>
> > >> Relationships with Other Apache Products
> > >>
> > >> As of now, Crail has been tested with Spark, Hadoop and Hive, but
> > >> it is designed to integrate with any of the Apache data processing
> > >> frameworks.
> > >>
> > >> Homogeneous Developers
> > >>
> > >> The project already has a diverse developer base including
> > >> contributions from organizations and public developers.
> > >>
> > >> An Excessive Fascination with the Apache Brand
> > >>
> > >> Crail solves a real need for a generic approach to leverage modern
> > >> network and storage hardware effectively in the Apache Hadoop and
> > >> Spark ecosystems. Our rationale for developing Crail as an Apache
> > >> project is detailed in the Rationale section.
> > >> We believe that the Apache brand and community process will help
> > >> us to engage a larger community and facilitate closer ties with
> > >> various Apache data processing projects.
> > >>
> > >> Documentation
> > >>
> > >> Documentation regarding Crail is available at www.crail.io
> > >>
> > >> Initial Source
> > >>
> > >> Initial source is available on GitHub under the Apache License 2.0:
> > >>
> > >> https://github.com/zrlio/crail
> > >>
> > >> External Dependencies
> > >>
> > >> Crail is written in Java and currently supports Apache Hadoop
> > >> MapReduce and Apache Spark runtimes. To the best of our knowledge,
> > >> all dependencies of Crail are distributed under Apache compatible
> > >> licenses.
> > >>
> > >> Required Resources
> > >>
> > >> Mailing lists
> > >>
> > >> priv...@crail.incubator.apache.org
> > >> d...@crail.incubator.apache.org
> > >> comm...@crail.incubator.apache.org
> > >>
> > >> Git repository
> > >>
> > >> https://git-wip-us.apache.org/repos/asf/incubator-crail.git
> > >>
> > >> Issue Tracking
> > >>
> > >> JIRA (Crail)
> > >>
> > >> Initial Committers
> > >>
> > >> Patrick Stuedi <stu AT ibm DOT zurich DOT com>
> > >> Animesh Trivedi <atr AT ibm DOT zurich DOT com>
> > >> Jonas Pfefferle <jpf AT ibm DOT zurich DOT com>
> > >> Bernard Metzler <bmt AT ibm DOT zurich DOT com>
> > >> Michael Kaufmann <kau AT ibm DOT zurich DOT com>
> > >> Adrian Schuepbach <dri AT ibm DOT zurich DOT com>
> > >> Patrick McArthur <patrick AT patrickmcarthur DOT net>
> > >> Ana Klimovic <anakli AT stanford DOT edu>
> > >> Yuval Degani <yuvaldeg AT mellanox DOT com>
> > >> Vu Pham <vuhuong AT mellanox DOT com>
> > >>
> > >> Affiliations
> > >>
> > >> IBM (Patrick Stuedi, Animesh Trivedi, Jonas Pfefferle, Bernard
> > >> Metzler, Michael Kaufmann, Adrian Schuepbach)
> > >> University of New Hampshire (Patrick McArthur)
> > >> Stanford University (Ana Klimovic)
> > >> Mellanox (Yuval Degani, Vu Pham)
> > >>
> > >> Sponsors
> > >>
> > >> Champion
> > >>
> > >> Luciano Resende <lresende AT apache DOT org>
> > >>
> > >> Nominated Mentors
> > >>
> > >> Luciano Resende <lresende AT apache DOT org>
> > >>
> > >> Raphael Bircher <rbircher AT apache DOT org>
> > >>
> > >> Julian Hyde <jhyde AT apache DOT org>
> > >>
> > >> Sponsoring Entity
> > >>
> > >> We would like to propose the Apache Incubator to sponsor this
> > >> project.
> > >>
> > >>
> > >> --
> > >> Luciano Resende
> > >> http://twitter.com/lresende1975
> > >> http://lresende.blogspot.com/
> > >
> >
> > --
> > My introduction https://youtu.be/Ln4vly5sxYU
>
> --
> Pierre Smits
>
> ORRTIZ.COM <http://www.orrtiz.com>
> OFBiz based solutions & services
>
> OFBiz Extensions Marketplace
> http://oem.ofbizci.net/oci-2/