Added :)

On Thu, Jan 12, 2017 at 1:43 AM, Tsuyoshi Ozawa <oz...@apache.org> wrote:
> Hi Henri,
>
> My previous comment was just a review comment against the proposal,
> but I forgot to mention an important thing.
>
> > Currently the list of committers is based on the current active coders, so
> > we're also very interested in hearing from anyone else who is interested in
> > working on the project, be they current or future contributors!
>
> I'm also interested in working on MXNet :-)
>
> Thanks,
> - Tsuyoshi
>
> On Thu, Jan 12, 2017 at 3:43 PM, 项亮 <xlvec...@gmail.com> wrote:
> > I would like to volunteer as a committer for MXNet
> >
> > github id: xlvector
> > email: xlvec...@gmail.com
> >
> > Liang Xiang from Toutiao Lab
> >
> > On 2017-01-06 13:12 (+0800), Henri Yandell <bay...@apache.org> wrote:
> >> Hello Incubator,
> >>
> >> I'd like to propose a new incubator Apache MXNet podling.
> >>
> >> The existing MXNet project (http://mxnet.io - 1.5 years old, 15 committers,
> >> 200 contributors) is very interested in joining Apache. MXNet is an
> >> open-source deep learning framework that allows you to define, train, and
> >> deploy deep neural networks on a wide array of devices, from cloud
> >> infrastructure to mobile devices.
> >>
> >> The wiki proposal page is located here:
> >>
> >> https://wiki.apache.org/incubator/MXNetProposal
> >>
> >> I've included the text below in case anyone wants to focus on parts of it
> >> in a reply.
> >>
> >> Looking forward to your thoughts, and for lots of interested Apache members
> >> to volunteer to mentor the project in addition to Sebastian and myself.
> >>
> >> Currently the list of committers is based on the current active coders, so
> >> we're also very interested in hearing from anyone else who is interested in
> >> working on the project, be they current or future contributors!
> >>
> >> Thanks,
> >>
> >> Hen
> >> On behalf of the MXNet project
> >>
> >> ---------
> >>
> >> = MXNet: Apache Incubator Proposal =
> >>
> >> == Abstract ==
> >>
> >> MXNet is a Flexible and Efficient Library for Deep Learning
> >>
> >> == Proposal ==
> >>
> >> MXNet is an open-source deep learning framework that allows you to define,
> >> train, and deploy deep neural networks on a wide array of devices, from
> >> cloud infrastructure to mobile devices. It is highly scalable, allowing for
> >> fast model training, and supports a flexible programming model and multiple
> >> languages. MXNet allows you to mix symbolic and imperative programming
> >> flavors to maximize both efficiency and productivity. MXNet is built on a
> >> dynamic dependency scheduler that automatically parallelizes both symbolic
> >> and imperative operations on the fly. A graph optimization layer on top of
> >> that makes symbolic execution fast and memory efficient. The MXNet library
> >> is portable and lightweight, and it scales to multiple GPUs and multiple
> >> machines.
> >>
> >> == Background ==
> >>
> >> Deep learning is a subset of machine learning and refers to a class of
> >> algorithms that use a hierarchical approach with non-linearities to
> >> discover and learn representations within data. Deep learning has recently
> >> become very popular due to its applicability to, and advancement of,
> >> domains such as Computer Vision, Speech Recognition, Natural Language
> >> Understanding and Recommender Systems. With pervasive and cost-effective
> >> cloud computing, large labeled datasets and continued algorithmic
> >> innovation, deep learning has become one of the most popular classes of
> >> algorithms for machine learning practitioners in recent years.
> >>
> >> == Rationale ==
> >>
> >> The adoption of deep learning is quickly expanding from initial deep domain
> >> experts rooted in academia to data scientists and developers working to
> >> deploy intelligent services and products. Deep learning, however, has many
> >> challenges. These include model training time (which can take days to
> >> weeks), programmability (not everyone writes Python or C++ or likes
> >> symbolic programming) and balancing production readiness (support for
> >> things like failover) with development flexibility (ability to program
> >> different ways, support for new operators and model types) and speed of
> >> execution (fast and scalable model training). Other frameworks excel on
> >> some but not all of these aspects.
> >>
> >> == Initial Goals ==
> >>
> >> MXNet is a fairly established project on GitHub with its first code
> >> contribution in April 2015 and roughly 200 contributors. It is used by
> >> several large companies and some of the top research institutions on the
> >> planet. Initial goals would be the following:
> >>
> >> 1. Move the existing codebase(s) to Apache
> >> 1. Integrate with the Apache development process/sign CLAs
> >> 1. Ensure all dependencies are compliant with Apache License version 2.0
> >> 1. Incremental development and releases per Apache guidelines
> >> 1. Establish engineering discipline and a predictable release cadence of
> >> high quality releases
> >> 1. Expand the community beyond the current base of expert level users
> >> 1. Improve usability and the overall developer/user experience
> >> 1. Add functionality to address newer problem types and algorithms
> >>
> >> == Current Status ==
> >>
> >> === Meritocracy ===
> >>
> >> The MXNet project already operates on meritocratic principles. Today, MXNet
> >> has developers worldwide and has accepted multiple major patches from a
> >> diverse set of contributors within both industry and academia.
> >> We would like to follow ASF meritocratic principles to encourage more
> >> developers to contribute to this project. We know that only active and
> >> committed developers from a diverse set of backgrounds can make MXNet a
> >> successful project. We are also improving the documentation and code to
> >> help new developers get started quickly.
> >>
> >> === Community ===
> >>
> >> Acceptance into the Apache Software Foundation would bolster the growing
> >> user and developer community around MXNet. That community includes around
> >> 200 contributors from academia and industry. The core developers of our
> >> project are listed in our contributors below and are also represented by
> >> logos on the mxnet.io site including Amazon, Baidu, Carnegie Mellon
> >> University, Turi, Intel, NYU, Nvidia, MIT, Microsoft, TuSimple, University
> >> of Alberta, University of Washington and Wolfram.
> >>
> >> === Core Developers ===
> >>
> >> (with GitHub logins)
> >>
> >> * Tianqi Chen (@tqchen)
> >> * Mu Li (@mli)
> >> * Junyuan Xie (@piiswrong)
> >> * Bing Xu (@antinucleon)
> >> * Chiyuan Zhang (@pluskid)
> >> * Minjie Wang (@jermainewang)
> >> * Naiyan Wang (@winstywang)
> >> * Yizhi Liu (@javelinjs)
> >> * Tong He (@hetong007)
> >> * Qiang Kou (@thirdwing)
> >> * Xingjian Shi (@sxjscience)
> >>
> >> === Alignment ===
> >>
> >> ASF is already the home of many distributed platforms, e.g., Hadoop, Spark
> >> and Mahout, each of which targets a different application domain. MXNet,
> >> being a distributed platform for large-scale deep learning, focuses on
> >> another important domain that still lacks a scalable, programmable,
> >> flexible and super fast open-source platform. The recent success of deep
> >> learning models, especially for vision and speech recognition tasks, has
> >> generated interest both in applying existing deep learning models and in
> >> developing new ones.
> >> Thus, an open-source platform for deep learning backed by some of the top
> >> industry and academic players will be able to attract a large community of
> >> users and developers. MXNet is a complex system needing many iterations of
> >> design, implementation and testing. Apache's collaboration framework,
> >> which encourages active contribution from developers, will inevitably help
> >> improve the quality of the system, as shown in the success of Hadoop,
> >> Spark, etc. Equally important is the community of users, which helps
> >> identify real-life applications of deep learning and helps to evaluate the
> >> system's performance and ease-of-use. We hope to leverage ASF for
> >> coordinating and promoting both communities, and in return benefit the
> >> communities with another useful tool.
> >>
> >> == Known Risks ==
> >>
> >> === Orphaned products ===
> >>
> >> Given the current level of investment in MXNet and the stakeholders using
> >> it, the risk of the project being abandoned is minimal. Amazon, for
> >> example, is in active development to use MXNet in many of its services and
> >> many large corporations use it in their production applications.
> >>
> >> === Inexperience with Open Source ===
> >>
> >> MXNet has existed as a healthy open source project for more than a year.
> >> During that time, the project has attracted 200+ contributors.
> >>
> >> === Homogeneous Developers ===
> >>
> >> The initial list of committers and contributors includes developers from
> >> several institutions and industry participants (see above).
> >>
> >> === Reliance on Salaried Developers ===
> >>
> >> Like most open source projects, MXNet receives substantial support from
> >> salaried developers.
> >> A large fraction of MXNet development is supported by graduate students at
> >> various universities in the course of research degrees - this is more a
> >> “volunteer” relationship, since in most cases students contribute vastly
> >> more than is necessary to immediately support research. In addition, those
> >> working from within corporations are devoting significant time and effort
> >> to the project - and these come from several organizations.
> >>
> >> === An Excessive Fascination with the Apache Brand ===
> >>
> >> We choose Apache not for publicity. We have two purposes. First, we hope
> >> that Apache's known best-practices for managing a mature open source
> >> project can help guide us. For example, we are feeling the growing pains
> >> of a successful open source project as we attempt a major refactor of the
> >> internals while customers are using the system in production. We seek
> >> guidance in communicating breaking API changes and version revisions.
> >> Also, as our involvement from major corporations increases, we want to
> >> assure our users that MXNet will stay open and not favor any particular
> >> platform or environment. These are some examples of the know-how and
> >> discipline we're hoping Apache can bring to our project.
> >>
> >> Second, we want to leverage Apache's reputation to recruit more developers
> >> to create a diverse community.
> >>
> >> === Relationship with Other Apache Products ===
> >>
> >> Apache Mahout and Apache Spark's MLlib are general machine learning
> >> systems. Deep learning algorithms can thus be implemented on these two
> >> platforms as well. However, in practice, the overlap will be minimal. Deep
> >> learning is so computationally intensive that it often requires specialized
> >> GPU hardware to accomplish tasks of meaningful size.
> >> Making efficient use of GPU hardware is complex because the hardware is so
> >> fast that the supporting systems around it must be carefully optimized to
> >> keep the GPU cores busy. Extending this capability to distributed
> >> multi-GPU and multi-host environments requires great care. This is a
> >> critical differentiator between MXNet and existing Apache machine learning
> >> systems.
> >>
> >> Mahout and Spark MLlib follow models in which their nodes run
> >> synchronously. This is the fundamental difference from MXNet, which
> >> follows the parameter server framework. MXNet can run synchronously or
> >> asynchronously. In addition, MXNet has optimizations for training a wide
> >> range of deep learning models using a variety of approaches (e.g., model
> >> parallelism and data parallelism), which makes MXNet much more efficient
> >> (near-linear speedup on state of the art models). MXNet also supports both
> >> imperative and symbolic approaches, providing ease of programming for deep
> >> learning algorithms.
> >>
> >> Other Apache projects that are potentially complementary:
> >>
> >> Apache Arrow - reading data in Apache Arrow's internal format from MXNet
> >> would allow users to run ETL/preprocessing in Spark, save the results in
> >> Arrow's format and then run DL algorithms on it.
> >>
> >> Apache Singa - MXNet and Singa are both deep learning projects, and can
> >> benefit from a larger deep learning community at Apache.
> >>
> >> == Documentation ==
> >>
> >> Documentation has recently migrated to http://mxnet.io. We continue to
> >> refine and improve the documentation.
> >>
> >> == Initial Source ==
> >>
> >> We currently use GitHub to maintain our source code,
> >> https://github.com/MXNet
> >>
> >> == Source and Intellectual Property Submission Plan ==
> >>
> >> MXNet code is available under Apache License, Version 2.0. We will work
> >> with the committers to get CLAs signed and review previous contributions.
> >>
> >> == External Dependencies ==
> >>
> >> * required by the core code base: GCC or CLOM, Clang, any BLAS library
> >> (ATLAS, OpenBLAS, MKL), dmlc-core, mshadow, ps-lite (which requires
> >> lib-zeromq), TBB
> >> * required for GPU usage: cudnn, cuda
> >> * required for python usage: Python 2/3
> >> * required for R module: R, Rcpp (GPLv2 licensing)
> >> * optional for image preparation and preprocessing: opencv
> >> * optional dependencies for additional features: torch7, numba, cython (in
> >> NNVM branch)
> >>
> >> Rcpp and lib-zeromq are expected to require licensing discussions.
> >>
> >> == Cryptography ==
> >>
> >> Not Applicable
> >>
> >> == Required Resources ==
> >>
> >> === Mailing Lists ===
> >>
> >> There is currently no mailing list.
> >>
> >> === Issue Tracking ===
> >>
> >> Currently uses GitHub to track issues. Would like to continue to do so.
> >>
> >> == Committers and Affiliations ==
> >>
> >> * Tianqi Chen (UW)
> >> * Mu Li (AWS)
> >> * Junyuan Xie (AWS)
> >> * Bing Xu (Apple)
> >> * Chiyuan Zhang (MIT)
> >> * Minjie Wang (NYU)
> >> * Naiyan Wang (TuSimple)
> >> * Yizhi Liu (Mediav)
> >> * Tong He (Simon Fraser University)
> >> * Qiang Kou (Indiana U)
> >> * Xingjian Shi (HKUST)
> >>
> >> == Sponsors ==
> >>
> >> === Champion ===
> >>
> >> Henri Yandell (bayard at apache.org)
> >>
> >> === Nominated Mentors ===
> >>
> >> Sebastian Schelter (s...@apache.org)
> >>
> >> === Sponsoring Entity ===
> >>
> >> We are requesting the Incubator to sponsor this project.
> >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > For additional commands, e-mail: general-h...@incubator.apache.org