+1 (binding)

Regards
JB

On 05/24/2016 12:22 AM, Andrew Purtell wrote:
Since discussion on the matter of PredictionIO has died down, I would like
to call a VOTE
on accepting PredictionIO into the Apache Incubator.

Proposal: https://wiki.apache.org/incubator/PredictionIO

​[ ] +1 Accept PredictionIO into the Apache Incubator
[ ] +0 Abstain
[ ] -1 Do not accept PredictionIO into the Apache Incubator, because ...

This vote will be open for at least 72 hours.

My vote is +1 (binding)

--

PredictionIO Proposal

Abstract

PredictionIO is an open source Machine Learning Server built on top of
state-of-the-art open source stack, that enables developers to manage and
deploy production-ready predictive services for various kinds of machine
learning tasks.

Proposal

The PredictionIO platform consists of the following components:

    * PredictionIO framework - provides the machine learning stack for
      building, evaluating and deploying engines with machine learning
      algorithms. It uses Apache Spark for processing.

    * Event Server - the machine learning analytics layer for unifying events
      from multiple platforms. It can use Apache HBase or any JDBC backends
      as its data store.

The PredictionIO community also maintains a Template Gallery, a place to
publish and download (free or proprietary) engine templates for different
types of machine learning applications, and is a complemental part of the
project. At this point we exclude the Template Gallery from the proposal,
as it has a separate set of contributors and we’re not familiar with an
Apache approved mechanism to maintain such a gallery.

Background

PredictionIO was started with a mission to democratize and bring machine
learning to the masses.

Machine learning has traditionally been a luxury for big companies like
Google, Facebook, and Netflix. There are ML libraries and tools lying
around the internet but the effort of putting them all together as a
production-ready infrastructure is a very resource-intensive task that is
remotely reachable by individuals or small businesses.

PredictionIO is a production-ready, full stack machine learning system that
allows organizations of any scale to quickly deploy machine learning
capabilities. It comes with official and community-contributed machine
learning engine templates that are easy to customize.

Rationale

As usage and number of contributors to PredictionIO has grown bigger and
more diverse, we have sought for an independent framework for the project
to keep thriving. We believe the Apache foundation is a great fit. Joining
Apache would ensure that tried and true processes and procedures are in
place for the growing number of organizations interested in contributing
to PredictionIO. PredictionIO is also a good fit for the Apache foundation.
PredictionIO was built on top of several Apache projects (HBase, Spark,
Hadoop). We are familiar with the Apache process and believe that the
democratic and meritocratic nature of the foundation aligns with the
project goals.

Initial Goals

The initial milestones will be to move the existing codebase to Apache and
integrate with the Apache development process. Once this is accomplished,
we plan for incremental development and releases that follow the Apache
guidelines, as well as growing our developer and user communities.

Current Status

PredictionIO has undergone nine minor releases and many patches.
PredictionIO is being used in production by Salesforce.com as well as many
other organizations and apps. The PredictionIO codebase is currently
hosted at GitHub, which will form the basis of the Apache git repository.

Meritocracy

We plan to invest in supporting a meritocracy. We will discuss the
requirements in an open forum. We intend to invite additional developers
to participate. We will encourage and monitor community participation so
that privileges can be extended to those that contribute.

Community

Acceptance into the Apache foundation would bolster the already strong
user and developer community around PredictionIO. That community includes
many contributors from various other companies, and an active mailing list
composed of hundreds of users.

Core Developers

The core developers of our project are listed in our contributors and
initial PPMC below. Though many are employed at Salesforce.com, there are
also engineers from ActionML, and independent developers.

Alignment

The ASF is the natural choice to host the PredictionIO project as its goal
is democratizing Machine Learning by making it more easily accessible to
every user/developer. PredictionIO is built on top of several top level
Apache projects as outlined above.

Known Risks

Orphaned Products

PredictionIO has a solid and growing community. It is deployed on
production environments by companies of all sizes to run various kinds of
predictive engines.

In addition to the community contribution to PredictionIO framework, the
community is also actively contributing new engines to the Template
Gallery as well as SDKs and documentation for the project. Salesforce is
committed to utilize and advance the PredictionIO code base and support
its user community.

Inexperience with Open Source

PredictionIO has existed as a healthy open source project for almost two
years and is the most starred Scala project on GitHub. All of the proposed
committers have contributed to ASF and Linux Foundation open source
projects. Several current committers on Apache projects and Apache Members
are involved in this proposal and intend to provide mentorship.

Homogeneous Developers

The initial list of committers includes developers from several
institutions, including Salesforce, ActionML, Channel4, USC as well as
unaffiliated developers.

Reliance on Salaried Developers

Like most open source projects, PredictionIO receives substantial support
from salaried developers. PredictionIO development is partially supported
by Salesforce.com, but there are many contributors from various other
companies, and an active mailing list composed of hundreds of users. We
will continue our efforts to ensure stewardship of the project to be
independent of salaried developers by meritocratically promoting those
contributors to committers.

Relationships with Other Apache Product

PredictionIO relies heavily on top level Apache projects such as Apache
Spark, HBase and Hadoop. However it brings a distinguished functionality,
rather than just an abstraction - Machine Learning in a plug-and-play
fashion.

Compared to Apache Mahout, which focuses on the development of a wide
variety of algorithms, PredictionIO offers a platform to manage the whole
machine learning workflow, including data collection, data preparation,
modeling, deployment and management of predictive services in production
environments.

An Excessive Fascination with the Apache Brand

PredictionIO is already a widely known open source project. This proposal
is not for the purpose of generating publicity. Rather, the primary
benefits to joining Apache are those outlined in the Rationale section.

Documentation

PredictionIO boasts rich and live documentation, included in the code repo
(docs/manual directory), is built with Middleman, and publicly hosted at
https://docs.prediction.io

Initial Source and Intellectual Property Submission Plan

Currently, the PredictionIO codebase is distributed under the Apache 2.0
License and hosted on GitHub: https://github.com/PredictionIO/PredictionIO

External Dependencies

PredictionIO has the following external dependencies:
  * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are needed)
  * Apache Spark 1.3.0 for Hadoop 2.4
  * Java SE Development Kit 8
  * and one of the following sets:
    * PostgreSQL 9.1
  or
    * MySQL 5.1
  or
    * Apache HBase 0.98.6
    * Elasticsearch 1.4.0

Upon acceptance to the incubator, we would begin a thorough analysis of
all transitive dependencies to verify this information and introduce
license checking into the build and release process by integrating with
Apache RAT.

Cryptography

PredictionIO does not include cryptographic code. We utilize standard
JCE and JSSE APIs provided by the Java Runtime Environment.

Required Resources

We request that following resources be created for the project to use

Mailing lists

   predictionio-priv...@incubator.apache.org (with moderated subscriptions)
   predictionio-dev
   predictionio-user
   predictionio-commits

   We will migrate the existing PredictionIO mailing lists.

Git repository

   The PredictionIO team would like to use Git for source control, due to our
   current use of GitHub.

   git://git.apache.org/incubator-predictionio

Documentation

   https://predictionio.incubator.apache.org/docs/

JIRA instance

   PredictionIO currently uses the GitHub issue tracking system associated
   with its repository: https://github.com/PredictionIO/PredictionIO/issues.
   We will migrate to Apache JIRA.

   JIRA PREDICTIONIO
   https://issues.apache.org/jira/browse/PREDICTIONIO

Other Resources

   TravisCI for builds and test running.

   PredictionIO's documentation, included in the code repo (docs/manual
   directory), is built with Middleman and publicly hosted at
   https://docs.prediction.io

   A blog to drive adoption and excitement at https://blog.prediction.io

Initial Committers

   Pat Ferrell
   Tamas Jambor
   Justin Yip
   Xusen Yin
   Lee Moon Soo
   Donald Szeto
   Kenneth Chan
   Tom Chan
   Simon Chan
   Marco Vivero
   Matthew Tovbin
   Yevgeny Khodorkovsky
   Felipe Oliveira
   Vitaly Gordon
   Alex Merritt

Affiliations

   Pat Ferrell - ActionML
   Tamas Jambor - Channel4
   Justin Yip - independent
   Xusen Yin - USC
   Lee Moon Soo - NFLabs
   Donald Szeto - Salesforce
   Kenneth Chan - Salesforce
   Tom Chan - Salesforce
   Simon Chan - Salesforce
   Marco Vivero - Salesforce
   Matthew Tovbin - Salesforce
   Yevgeny Khodorkovsky - Salesforce
   Felipe Oliveira - Salesforce
   Vitaly Gordon - Salesforce
   Alex Merritt - ActionML

Sponsors

Champion

   Andrew Purtell <apurtell at apache dot org>

Nominated Mentors

   Andrew Purtell <apurtell at apache dot org>
   James Taylor <jtaylor at apache dot org>
   Lars Hofhansl <larsh at apache dot org>
   Suneel Marthi <smarthi at apache dot org>
   Xiangrui Meng <meng at apache dot org>
   Luciano Resende <lresende at apache dot org>

Sponsoring Entity

   Apache Incubator PMC



--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org

Reply via email to