Re: [VOTE] Accept Torii into Apache Incubator

Sam Ruby Wed, 02 Dec 2015 06:21:00 -0800

On Tue, Dec 1, 2015 at 10:24 AM, Steve Loughran <ste...@hortonworks.com> wrote:
> Think I've missed the vote window, but
>
> +1 binding
>
> I will repeat what I raised when the proposal first came up, something that
> wasn't addressed at all: ZeroMQ is LGPL, which is forbidden as a mandatory
> dependency in ASF projects.
>
> Step 1 of the project is going to have to confirm that ZeroMQ's LGPL +
> Static Linking Exception license is sufficient for it to be allowed as a
> dependency of the project.


I'd like to encourage ZeroMQ to move to MPL (and I'm willing to help
make that case).

Given that LGPL is essentially GPL plus a static linking exception, I
don't know how LGPL + Static Linking Exception helps. The ZeroMQ
licensing page[1] suggests that it is a problem for corporate lawyers to
accept, and Jim has repeatedly said in various ways that our goal is to
be a no-brainer.

> If it's not, then that's going to be a fundamental barrier to releasing Torii 
> as ASF-signed off artifacts

- Sam Ruby

[1] http://zeromq.org/area:licensing

>>> On Thu, Nov 26, 2015 at 10:33 AM, Luciano Resende <luckbr1...@gmail.com>
>>> wrote:
>>>> After initial discussion (under the name Spark-Kernel), please vote on
>>>> the acceptance of the Torii project for incubation at the Apache
>>>> Incubator. The full proposal is available at the end of this message
>>>> and on the wiki at:
>>>>
>>>> https://wiki.apache.org/incubator/ToriiProposal
>>>>
>>>> Please cast your votes:
>>>>
>>>> [ ] +1, bring Torii into Incubator
>>>> [ ] +0, I don't care either way
>>>> [ ] -1, do not bring Torii into Incubator, because...
>>>>
>>>> Due to the long holiday weekend in the US, I will leave the vote open
>>>> until December 1st.
>>>>
>>>>
>>>> = Torii =
>>>>
>>>> == Abstract ==
>>>> Torii provides applications with a mechanism to interactively and
>>>> remotely
>>>> access Apache Spark.
>>>>
>>>> == Proposal ==
>>>> Torii enables interactive applications to access Apache Spark clusters.
>>>> More specifically:
>>>> * Applications can send code-snippets and libraries for execution by
>>>> Spark
>>>> * Applications can be deployed separately from Spark clusters and
>>>> communicate with the Torii using the provided Torii client
>>>> * Execution results and streaming data can be sent back to calling
>>>> applications
>>>> * Applications no longer have to be network connected to the workers
>>>> on a
>>>> Spark cluster because the Torii acts as each application’s proxy
>>>> * Work has started on enabling Torii to support languages in addition
>>>> to
>>>> Scala, namely Python (with PySpark), R (with SparkR), and SQL (with
>>>> SparkSQL)
>>>>
>>>> == Background & Rationale ==
>>>> Apache Spark provides applications with a fast and general purpose
>>>> distributed computing engine that supports static and streaming data,
>>>> tabular and graph representations of data, and an extensive set of
>>>> machine learning libraries. Consequently, a wide variety of applications
>>>> will be written for Spark and there will be interactive applications
>>>> that
>>>> require relatively frequent function evaluations, and batch-oriented
>>>> applications that require one-shot or only occasional evaluation.
>>>>
>>>> Apache Spark provides two mechanisms for applications to connect with
>>>> Spark. The primary mechanism launches applications on Spark clusters
>>>> using
>>>> spark-submit (
>>>> http://spark.apache.org/docs/latest/submitting-applications.html); this
>>>> requires developers to bundle their application code plus any
>>>> dependencies
>>>> into JAR files, and then submit them to Spark. A second mechanism is an
>>>> ODBC/JDBC API (
>>>> http://spark.apache.org/docs/latest/sql-programming-guide.html#distributed-sql-engine)
>>>> which enables applications to issue SQL queries against SparkSQL.
>>>>
>>>> Our experience when developing interactive applications, such as
>>>> analytic
>>>> applications integrated with Notebooks, to run against Spark was that
>>>> the
>>>> spark-submit mechanism was overly cumbersome and slow (requiring JAR
>>>> creation and forking processes to run spark-submit), and the SQL
>>>> interface
>>>> was too limiting and did not offer easy access to components other than
>>>> SparkSQL, such as streaming. The most promising mechanism provided by
>>>> Apache Spark was the command-line shell (
>>>> http://spark.apache.org/docs/latest/programming-guide.html#using-the-shell)
>>>> which enabled us to execute code snippets and dynamically control the
>>>> tasks submitted to a Spark cluster. Spark does not provide the command-line
>>>> shell as a consumable service but it provided us with the starting point
>>>> from which we developed Torii.
>>>>
>>>> == Current Status ==
>>>> Torii was first developed by a small team working on an internal-IBM
>>>> Spark-related project in July 2014. In recognition of its likely general
>>>> utility to Spark users and developers, in November 2014 the Torii
>>>> project
>>>> was moved to GitHub and made available under the Apache License V2.
>>>>
>>>> == Meritocracy ==
>>>> The current developers are familiar with the meritocratic open source
>>>> development process at Apache. As the project has gathered interest at
>>>> GitHub the developers have actively started a process to invite
>>>> additional
>>>> developers into the project, and we have at least one new developer who
>>>> is
>>>> ready to contribute code to the project.
>>>>
>>>> == Community ==
>>>> We started building a community around the Torii project when we moved it to
>>>> GitHub about one year ago. Since then we have grown to about 70 people,
>>>> and
>>>> there are regular requests and suggestions from the community. We
>>>> believe
>>>> that providing Apache Spark application developers with a
>>>> general-purpose
>>>> and interactive API holds a lot of community potential, especially
>>>> considering possible tie-ins with Notebooks and the data science community.
>>>>
>>>> == Core Developers ==
>>>> The core developers of the project are currently all from IBM, from the
>>>> IBM
>>>> Emerging Technology team and from IBM’s recently formed Spark Technology
>>>> Center.
>>>>
>>>> == Alignment ==
>>>> Apache, as the home of Apache Spark, is the most natural home for the
>>>> Torii
>>>> project because it was designed to work with Apache Spark and to provide
>>>> capabilities for interactive applications and data science tools not
>>>> provided by Spark itself.
>>>>
>>>> The Torii also has an affinity with Jupyter (jupyter.org) because it
>>>> uses
>>>> the Jupyter protocol for communications, and so Jupyter Notebooks can
>>>> directly use the Torii as a kernel for communicating with Apache Spark.
>>>> However, we believe that the Torii provides a general-purpose mechanism
>>>> enabling a wider variety of applications than just Notebooks to access
>>>> Spark, and so the Torii’s greatest affinity is with Apache and Apache
>>>> Spark.
>>>>
>>>> == Known Risks ==
>>>>
>>>> === Orphaned products ===
>>>> We believe the Torii project has a low risk of abandonment due to
>>>> interest
>>>> in its continuing existence from several parties. More specifically, the
>>>> Torii provides a capability that is not provided by Apache Spark today,
>>>> one that enables a wider range of applications to leverage Spark. For
>>>> example,
>>>> IBM uses (and is considering) the Torii in several offerings including
>>>> its
>>>> IBM Analytics for Apache Spark product in the Bluemix Cloud. There are
>>>> also
>>>> a couple of other commercial users who are using or considering its use
>>>> in
>>>> their offerings. Furthermore, Jupyter Notebooks are used by data
>>>> scientists
>>>> and Spark is gaining popularity as an analytic engine for them. Jupyter
>>>> Notebooks are very easily enabled with the Torii and so there is another
>>>> constituency for it.
>>>>
>>>> === Inexperience with Open Source ===
>>>> The Torii project has been running as an open-source project (albeit
>>>> with
>>>> only IBM committers) for the past several months. The project has an
>>>> active
>>>> issue tracker and due to the interest indicated by the nature and
>>>> volume of
>>>> requests and comments, the team has publicly stated it is beginning to
>>>> build a process so they can accept third-party contributions to the
>>>> project.
>>>>
>>>> === Relationships with Other Apache Products ===
>>>> The Torii has a clear affinity with the Apache Spark project because it
>>>> is
>>>> designed to provide capabilities for interactive applications and data
>>>> science tools not provided by Spark itself. The Torii can be a back-end
>>>> for
>>>> the Zeppelin project currently incubating at Apache. There is interest
>>>> from
>>>> the Torii community to develop this capability and an experimental
>>>> branch
>>>> has been started.
>>>>
>>>> === Homogeneous Developers ===
>>>> The current group of developers working on Torii are all from IBM
>>>> although
>>>> the group is in the process of expanding its membership to include
>>>> members
>>>> of the GitHub community who are not from IBM and who have been active in
>>>> the Torii community on GitHub.
>>>>
>>>> === Reliance on Salaried Developers ===
>>>> The initial committers are full-time employees at IBM although not all
>>>> work
>>>> on the project full-time.
>>>>
>>>> === Excessive Fascination with the Apache Brand ===
>>>> We believe the Torii benefits Apache Spark application developers, and
>>>> we
>>>> are interested in an Apache Torii project to benefit these developers by
>>>> engaging a larger community, facilitating closer ties with the existing
>>>> Spark project, and yes, gaining more visibility for the Torii as a
>>>> solution.
>>>>
>>>> === Documentation ===
>>>> Comprehensive documentation including “Getting Started”, API
>>>> specifications
>>>> and a Roadmap are available from the GitHub project, see
>>>> https://github.com/ibm-et/Torii/wiki.
>>>>
>>>> === Initial Source ===
>>>> The source code resides at https://github.com/ibm-et/Torii.
>>>>
>>>> === External Dependencies ===
>>>> The Torii depends upon a number of Apache projects:
>>>> * Spark
>>>> * Hadoop
>>>> * Ivy
>>>> * Commons
>>>>
>>>> The Torii also depends upon a number of other open source projects:
>>>> * ZeroMQ (LGPL with Static Linking Exception,
>>>> http://zeromq.org/area:licensing)
>>>> * Akka (MIT)
>>>> * JOpt Simple (MIT)
>>>> * Spring Framework Core (Apache v2)
>>>> * Play (Apache v2)
>>>> * SLF4J (MIT)
>>>> * Scala
>>>> * Scalatest (Apache v2)
>>>> * Scalactic (Apache v2)
>>>> * Mockito (MIT)
>>>>
>>>> == Required Resources ==
>>>>
>>>> === Mailing lists ===
>>>>
>>>> * priv...@torii.incubator.apache.org (with moderated subscriptions)
>>>> * comm...@torii.incubator.apache.org
>>>> * d...@torii.incubator.apache.org
>>>>
>>>> === Git Repository ===
>>>>
>>>> * https://git-wip-us.apache.org/repos/asf/incubator-torii.git
>>>>
>>>> === Issue Tracking ===
>>>>
>>>> * A JIRA issue tracker: https://issues.apache.org/jira/browse/TORII
>>>>
>>>> == Initial Committers ==
>>>>
>>>> * Leugim Bustelo (lbustelo AT us DOT ibm DOT com)
>>>> * Jakob Odersky (odersky AT us DOT ibm DOT com)
>>>> * Luciano Resende (lresende AT apache DOT org)
>>>> * Robert Senkbeil (rcsenkbe AT us DOT ibm DOT com)
>>>> * Corey Stubbs (cstubbs AT us DOT ibm DOT com)
>>>> * Miao Wang (wangmiao AT us DOT ibm DOT com)
>>>> * Sean Welleck (swelleck AT us DOT ibm DOT com)
>>>>
>>>> === Affiliations ===
>>>> All of the initial committers are employed by IBM.
>>>>
>>>> == Sponsors ==
>>>>
>>>> === Champion ===
>>>> * Sam Ruby (rubys AT apache DOT org)
>>>>
>>>> === Nominated Mentors ===
>>>> * Luciano Resende (lresende AT apache DOT org)
>>>> * Reynold Xin (rxin AT apache DOT org)
>>>> * Hitesh Shah (hitesh AT apache DOT org)
>>>> * Julien Le Dem (julien AT apache DOT org)
>>>>
>>>> === Sponsoring Entity ===
>>>>
>>>> We would like to propose the Apache Incubator to sponsor this project.
>>>>
>>>>
>>>> --
>>>> Luciano Resende
>>>> http://people.apache.org/~lresende
>>>> http://twitter.com/lresende1975
>>>> http://lresende.blogspot.com/
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>>> For additional commands, e-mail: general-h...@incubator.apache.org
>>>
