I am just waiting on an accept from Alex to be in the initial committers list and will then update the proposal on the wiki.
On Mon, May 16, 2016 at 12:07 PM, Simon Chan <si...@salesforce.com> wrote: > Yes, it includes everyone who previously contributed code from PredictionIO > before the acquisition and still want to be involved in the project. > > We may have missed "Alex Merritt", going to add him to the list soon. > > Simon > > > On Mon, May 16, 2016 at 11:58 AM, Suneel Marthi <smar...@apache.org> > wrote: > > > I do have a question about the proposed list of committers. > > > > Does the list also include all of those folks who were with PredictionIO > > (and had contributed to the project) and then chose to leave when PIO was > > acquired by Salesforce? > > > > > > > > > > On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré <j...@nanthrax.net> > > wrote: > > > > > By the way, we have some discussion about integrating Zeppelin with > Beam > > ;) > > > > > > Regards > > > JB > > > > > > On 05/15/2016 02:32 AM, Roman Shaposhnik wrote: > > > > > >> Super excited to see this proposal! This will finally allow us to have > > >> an ASF managed > > >> backend for next generation data-driven apps that I see emerging quite > > >> rapidly. > > >> > > >> The proposal looks great to me (although I'd recommend calling Scala > > >> as an implementation > > >> language more prominently since it may attract additional developers > > >> with affinity to it). > > >> > > >> I do have two questions about technology: > > >> 1. do you think it would be possible to leverage Apache Beam > > >> (incubating) > > >> for abstracting away dependency on execution frameworks? My > > >> understanding > > >> is that PredictionIO currently only run on Spark. > > >> 2. is there a potential integration with Apache Zeppelin possible? > > >> > > >> Thanks, > > >> Roman. > > >> > > >> On Fri, May 13, 2016 at 1:41 PM, Andrew Purtell <apurt...@apache.org> > > >> wrote: > > >> > > >>> Greetings, > > >>> > > >>> It is my pleasure to > > >>> > > >>> propose the PredictionIO project for incubation at the Apache > Software > > >>> Foundation. > > >>> > > >>> PredictionIO is a > > >>> popular > > >>> open > > >>> > > >>> source Machine Learning Server built on top of a state-of-the-art > open > > >>> source stack, including several Apache technologies, that > > >>> > > >>> enables developers to manage and deploy production-ready predictive > > >>> services for various kinds of machine learning tasks > > >>> , with more than 400 production deployments around the world and a > > >>> growing > > >>> contributor community. > > >>> > > >>> > > >>> The text of the proposal is included below and is also available at > > >>> https://wiki.apache.org/incubator/PredictionIO > > >>> > > >>> Best regards, > > >>> Andrew Purtell > > >>> > > >>> > > >>> = PredictionIO Proposal = > > >>> > > >>> === Abstract === > > >>> PredictionIO is an open source Machine Learning Server built on top > of > > >>> state-of-the-art open source stack, that enables developers to manage > > and > > >>> deploy production-ready predictive services for various kinds of > > machine > > >>> learning tasks. > > >>> > > >>> === Proposal === > > >>> The PredictionIO platform consists of the following components: > > >>> > > >>> * PredictionIO framework - provides the machine learning stack for > > >>> building, evaluating and deploying engines with machine learning > > >>> algorithms. It uses Apache Spark for processing. > > >>> > > >>> * Event Server - the machine learning analytics layer for unifying > > >>> events > > >>> from multiple platforms. It can use Apache HBase or any JDBC > backends > > >>> as its data store. > > >>> > > >>> The PredictionIO community also maintains a > > >>> > > >>> Template Gallery, a place to > > >>> publish and download (free or proprietary) engine templates for > > different > > >>> types of machine learning applications, and is a complemental part of > > the > > >>> project. At this point we exclude the Template Gallery from the > > proposal, > > >>> as it has a separate set of contributors and we’re not familiar with > an > > >>> Apache approved mechanism to maintain such a gallery. > > >>> > > >>> You can find the Template Gallery at > https://templates.prediction.io/ > > >>> > > >>> === Background === > > >>> PredictionIO was started with a mission to democratize and bring > > machine > > >>> learning to the masses. > > >>> > > >>> Machine learning has traditionally been a luxury for big companies > like > > >>> Google, Facebook, and Netflix. There are ML libraries and tools lying > > >>> around the internet but the effort of putting them all together as a > > >>> production-ready infrastructure is a very resource-intensive task > that > > is > > >>> remotely reachable by individuals or small businesses. > > >>> > > >>> PredictionIO is a production-ready, full stack machine learning > system > > >>> that > > >>> allows organizations of any scale to quickly deploy machine learning > > >>> capabilities. It comes with official and community-contributed > machine > > >>> learning engine templates that are easy to customize. > > >>> > > >>> === Rationale === > > >>> As usage and number of contributors to PredictionIO has grown bigger > > and > > >>> more diverse, we have sought for an independent framework for the > > project > > >>> to keep thriving. We believe the Apache foundation is a great fit. > > >>> Joining > > >>> Apache would ensure that tried and true processes and procedures are > in > > >>> place for the growing number of organizations interested in > > contributing > > >>> to PredictionIO. PredictionIO is also a good fit for the Apache > > >>> foundation. > > >>> PredictionIO was built on top of several Apache projects (HBase, > Spark, > > >>> Hadoop). We are familiar with the Apache process and believe that the > > >>> democratic and meritocratic nature of the foundation aligns with the > > >>> project goals. > > >>> > > >>> === Initial Goals === > > >>> The initial milestones will be to move the existing codebase to > Apache > > >>> and > > >>> integrate with the Apache development process. Once this is > > accomplished, > > >>> we plan for incremental development and releases that follow the > Apache > > >>> guidelines, as well as growing our developer and user communities. > > >>> > > >>> === Current Status === > > >>> PredictionIO has undergone nine minor releases and many patches. > > >>> PredictionIO is being used in production by Salesforce.com as well as > > >>> many > > >>> other organizations and apps. The PredictionIO codebase is currently > > >>> hosted at GitHub, which will form the basis of the Apache git > > repository. > > >>> > > >>> ==== Meritocracy ==== > > >>> We plan to invest in supporting a meritocracy. We will discuss the > > >>> requirements in an open forum. We intend to invite additional > > developers > > >>> to participate. We will encourage and monitor community participation > > so > > >>> that privileges can be extended to those that contribute. > > >>> > > >>> ==== Community ==== > > >>> Acceptance into the Apache foundation would bolster the already > strong > > >>> user and developer community around PredictionIO. That community > > includes > > >>> many contributors from various other companies, and an active mailing > > >>> list > > >>> composed of hundreds of users. > > >>> > > >>> ==== Core Developers ==== > > >>> The core developers of our project are listed in our contributors and > > >>> initial PPMC below. Though many are employed at Salesforce.com, there > > are > > >>> also engineers from ActionML, and independent developers. > > >>> > > >>> === Alignment === > > >>> The ASF is the natural choice to host the PredictionIO project as its > > >>> goal > > >>> is democratizing Machine Learning by making it more easily accessible > > to > > >>> every user/developer. PredictionIO is built on top of several top > level > > >>> Apache projects as outlined above. > > >>> > > >>> === Known Risks === > > >>> > > >>> ==== Orphaned products ==== > > >>> PredictionIO has a solid and growing community. It is deployed on > > >>> production environments by companies of all sizes to run various > kinds > > of > > >>> predictive engines. > > >>> > > >>> In addition to the community contribution to PredictionIO framework, > > the > > >>> community is also actively contributing new engines to the Template > > >>> Gallery as well as SDKs and documentation for the project. Salesforce > > is > > >>> committed to utilize and advance the PredictionIO code base and > support > > >>> its user community. > > >>> > > >>> ==== Inexperience with Open Source ==== > > >>> PredictionIO has existed as a healthy open source project for almost > > two > > >>> years and is the most starred Scala project on GitHub. All of the > > >>> proposed > > >>> committers have contributed to ASF and Linux Foundation open source > > >>> projects. Several current committers on Apache projects and Apache > > >>> Members > > >>> are involved in this proposal and intend to provide mentorship. > > >>> > > >>> ==== Homogeneous Developers ==== > > >>> The initial list of committers includes developers from several > > >>> institutions, including Salesforce, ActionML, Channel4, USC as well > as > > >>> unaffiliated developers. > > >>> > > >>> ==== Reliance on Salaried Developers ==== > > >>> Like most open source projects, PredictionIO receives substantial > > support > > >>> from salaried developers. PredictionIO development is partially > > supported > > >>> by Salesforce.com, but there are many contributors from various other > > >>> companies, and an active mailing list composed of hundreds of users. > We > > >>> will continue our efforts to ensure stewardship of the project to be > > >>> independent of salaried developers by meritocratically promoting > those > > >>> contributors to committers. > > >>> > > >>> ==== Relationships with Other Apache Product ==== > > >>> PredictionIO relies heavily on top level apache projects such as > Apache > > >>> Spark, HBase and Hadoop. However it brings a distinguished > > functionality, > > >>> rather than just an abstraction - Machine Learning in a plug-and-play > > >>> fashion. > > >>> > > >>> Compared to Apache Mahout, which focuses on the development of a wide > > >>> variety of algorithms, PredictionIO offers a platform to manage the > > whole > > >>> machine learning workflow, including data collection, data > preparation, > > >>> modeling, deployment and management of predictive services in > > production > > >>> environments. > > >>> > > >>> ==== An Excessive Fascination with the Apache Brand ==== > > >>> PredictionIO is already a widely known open source project. This > > proposal > > >>> is not for the purpose of generating publicity. Rather, the primary > > >>> benefits to joining Apache are those outlined in the Rationale > section. > > >>> > > >>> === Documentation === > > >>> PredictionIO boasts rich and live documentation, included in the code > > >>> repo > > >>> (docs/manual directory), is built with Middleman, and publicly hosted > > at > > >>> https://docs.prediction.io > > >>> > > >>> === Initial Source and Intellectual Property Submission Plan === > > >>> Currently, the PredictionIO codebase is distributed under the Apache > > 2.0 > > >>> License and hosted on GitHub: > > >>> https://github.com/PredictionIO/PredictionIO > > >>> > > >>> === External Dependencies === > > >>> PredictionIO has the following external dependencies: > > >>> * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are > > >>> needed) > > >>> * Apache Spark 1.3.0 for Hadoop 2.4 > > >>> * Java SE Development Kit 8 > > >>> * and one of the following sets: > > >>> > > >>> * PostgreSQL 9.1 > > >>> > > >>> > > >>> or > > >>> > > >>> > > >>> * MySQL 5.1 > > >>> > > >>> or > > >>> > > >>> > > >>> * Apache HBase 0.98.6 > > >>> > > >>> > > >>> * Elasticsearch 1.4.0 > > >>> > > >>> Upon acceptance to the incubator, we would begin a thorough analysis > of > > >>> all transitive dependencies to verify this information and introduce > > >>> license checking into the build and release process by integrating > with > > >>> Apache RAT. > > >>> > > >>> === Cryptography === > > >>> PredictionIO does not include cryptographic code. We utilize standard > > >>> JCE and JSSE APIs provided by the Java Runtime Environment. > > >>> > > >>> === Required Resources === > > >>> We request that following resources be created for the project to use > > >>> > > >>> ==== Mailing lists ==== > > >>> > > >>> predictionio-priv...@incubator.apache.org (with moderated > > subscriptions) > > >>> > > >>> predictionio-dev > > >>> > > >>> predictionio-user > > >>> > > >>> predictionio-commits > > >>> > > >>> We will migrate the existing PredictionIO mailing lists. > > >>> > > >>> ==== Git repository ==== > > >>> The PredictionIO team would like to use Git for source control, due > to > > >>> our > > >>> current use of GitHub. > > >>> > > >>> git://git.apache.org/incubator-predictionio > > >>> > > >>> ==== Documentation ==== > > >>> https://predictionio.incubator.apache.org/docs/ > > >>> > > >>> ==== JIRA instance ==== > > >>> PredictionIO currently uses the GitHub issue tracking system > associated > > >>> with its repository: > > https://github.com/PredictionIO/PredictionIO/issues > > >>> . > > >>> We will migrate to Apache JIRA. > > >>> > > >>> JIRA PREDICTIONIO > > >>> https://issues.apache.org/jira/browse/PREDICTIONIO > > >>> > > >>> ==== Other Resources ==== > > >>> * TravisCI for builds and test running. > > >>> > > >>> * PredictionIO's documentation, included in the code repo > (docs/manual > > >>> directory), is built with Middleman and publicly hosted > > >>> https://docs.prediction.io > > >>> > > >>> * A blog to drive adoption and excitement at > > https://blog.prediction.io > > >>> > > >>> === Initial Committers === > > >>> > > >>> * Pat Ferrell > > >>> > > >>> * Tamas Jambor > > >>> > > >>> * Justin Yip > > >>> > > >>> * Xusen Yin > > >>> > > >>> * Lee Moon Soo > > >>> > > >>> * Donald Szeto > > >>> > > >>> * Kenneth Chan > > >>> > > >>> * Tom Chan > > >>> > > >>> * Simon Chan > > >>> > > >>> * Marco Vivero > > >>> > > >>> * Matthew Tovbin > > >>> > > >>> * Yevgeny Khodorkovsky > > >>> > > >>> * Felipe Oliveira > > >>> > > >>> * Vitaly Gordon > > >>> > > >>> === Affiliations === > > >>> > > >>> * Pat Ferrell - ActionML > > >>> > > >>> * Tamas Jambor - Channel4 > > >>> > > >>> * Justin Yip - independent > > >>> > > >>> * Xusen Yin - USC > > >>> > > >>> * Lee Moon Soo - NFLabs > > >>> > > >>> * Donald Szeto - Salesforce > > >>> > > >>> * Kenneth Chan - Salesforce > > >>> > > >>> * Tom Chan - Salesforce > > >>> > > >>> * Simon Chan - Salesforce > > >>> > > >>> * Marco Vivero - Salesforce > > >>> > > >>> * Matthew Tovbin - Salesforce > > >>> > > >>> * Yevgeny Khodorkovsky - Salesforce > > >>> > > >>> * Felipe Oliveira - Salesforce > > >>> > > >>> * Vitaly Gordon - Salesforce > > >>> > > >>> === Sponsors === > > >>> > > >>> ==== Champion ==== > > >>> > > >>> Andrew Purtell <apurtell at apache dot org> > > >>> > > >>> ==== Nominated Mentors ==== > > >>> > > >>> * Andrew Purtell <apurtell at apache dot org> > > >>> > > >>> * James Taylor <jtaylor at apache dot org> > > >>> > > >>> * Lars Hofhansl <larsh at apache dot org> > > >>> > > >>> * Suneel Marthi <smarthi at apache dot org> > > >>> > > >>> * Xiangrui Meng <meng at apache dot org> > > >>> > > >>> * Luciano Resende <lresende at apache dot org> > > >>> > > >>> ==== Sponsoring Entity ==== > > >>> > > >>> Apache Incubator PMC > > >>> > > >> > > >> --------------------------------------------------------------------- > > >> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org > > >> For additional commands, e-mail: general-h...@incubator.apache.org > > >> > > >> > > > -- > > > Jean-Baptiste Onofré > > > jbono...@apache.org > > > http://blog.nanthrax.net > > > Talend - http://www.talend.com > > > > > > --------------------------------------------------------------------- > > > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org > > > For additional commands, e-mail: general-h...@incubator.apache.org > > > > > > > > > -- Best regards, - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)