Thanks for the information, John!
On Wed, Jan 31, 2018 at 9:50 PM, John D. Ament <johndam...@apache.org> wrote:
> Sorry for mid-posting.
>
> This isn't the list to determine if a project name is suitable. There's a
> JIRA project dedicated to that, and if you need a quick answer better to
> email trademarks@ to get a more precise answer.
>
> The question is really going to be, is "Apache Onyx" going to be easily
> confused with something else.
>
> John
>
> On Sun, Jan 28, 2018 at 4:50 AM Byung-Gon Chun <bgc...@gmail.com> wrote:
>
> > Thank you for all the information! It looks like Surf doesn't work.
> >
> > If possible, we'd like to keep Onyx.
> > Another name we came up with is Coral.
> >
> > Thanks!
> > -Gon
> >
> >
> > On Sun, Jan 28, 2018 at 4:21 AM, Leif Hedstrom <zw...@apache.org> wrote:
> >
> > > Did we rule out Onyx for sure? Just because some other project might
> > > use it on say github doesn’t necessarily exclude us from having an
> > > Apache Onyx?
> > >
> > > FWIW, I agree that surf is too similar in pronunciation to Apache serf. :)
> > >
> > > Cheers,
> > >
> > > -- Leif
> > >
> > > > On Jan 27, 2018, at 07:31, Dave Fisher <dave2w...@comcast.net> wrote:
> > > >
> > > > Checking “Serf Software” which sounds the same.
> > > >
> > > > (1) there is already Apache Serf
> > > > (2) Serf is a product from Hashicorp at https://www.serf.io/. This
> > > > would definitely confuse as it is apparently comparable to ZooKeeper.
> > > >
> > > > Regards,
> > > > Dave
> > > >
> > > > Sent from my iPhone
> > > >
> > > >> On Jan 27, 2018, at 3:12 AM, sebb <seb...@gmail.com> wrote:
> > > >>
> > > >> A brief search for 'Surf Software' shows quite a few hits.
> > > >> I have not looked to see if they would be likely to be confused with
> > > >> this project or cause problems for others.
> > > >>
> > > >> But it looks as though there might be a problem:
> > > >> Surfer - Golden Software
> > > >> surf @ sourceforge
> > > >> Surf Software company
> > > >>
> > > >>
> > > >>> On 27 January 2018 at 08:03, Byung-Gon Chun <bgc...@gmail.com> wrote:
> > > >>> Since we cannot use the name Onyx, we would like to change the
> > > >>> project name to Surf.
> > > >>> I hope that this name works.
> > > >>>
> > > >>> -Gon
> > > >>>
> > > >>> ---
> > > >>> Byung-Gon Chun
> > > >>>
> > > >>>
> > > >>>> On Sat, Jan 27, 2018 at 4:57 AM, Byung-Gon Chun <bgc...@gmail.com> wrote:
> > > >>>>
> > > >>>>
> > > >>>>> On Sat, Jan 27, 2018 at 4:09 AM, Davor Bonaci <da...@apache.org> wrote:
> > > >>>>>
> > > >>>>> Great work -- I think this technology has a lot of promise, and I'd
> > > >>>>> love to see its evolution inside the Foundation.
> > > >>>>>
> > > >>>>>
> > > >>>> Thanks, Davor!
> > > >>>>
> > > >>>>
> > > >>>>> Parts of it, like the Onyx Intermediate Representation [1], overlap
> > > >>>>> with the work-in-progress inside the Apache Beam project
> > > >>>>> ("portability"). We'd love to work together on this -- would you be
> > > >>>>> open to such collaboration? If so, it may not be necessary to start
> > > >>>>> from scratch, and we could leverage the work already done.
> > > >>>>>
> > > >>>>>
> > > >>>> Sure. We're open to collaboration.
> > > >>>>
> > > >>>>
> > > >>>>> Regarding the name, Onyx would likely have to be renamed, due to a
> > > >>>>> conflict with a related technology [2].
> > > >>>>>
> > > >>>>>
> > > >>>> Thanks for pointing it out. It's difficult to come up with a good
> > > >>>> short name. :)
> > > >>>> Do you have any suggestions?
> > > >>>>
> > > >>>> Thanks!
> > > >>>> -Gon
> > > >>>>
> > > >>>> ---
> > > >>>> Byung-Gon Chun
> > > >>>>
> > > >>>>
> > > >>>>> Davor
> > > >>>>>
> > > >>>>> [1] https://snuspl.github.io/onyx/docs/ir/
> > > >>>>> [2] http://www.onyxplatform.org/
> > > >>>>>
> > > >>>>>> On Thu, Jan 25, 2018 at 3:28 PM, Byung-Gon Chun <bgc...@gmail.com> wrote:
> > > >>>>>>
> > > >>>>>> Dear Apache Incubator Community,
> > > >>>>>>
> > > >>>>>> Please accept the following proposal for presentation and discussion:
> > > >>>>>> https://wiki.apache.org/incubator/OnyxProposal
> > > >>>>>>
> > > >>>>>> Onyx is a data processing system that aims to flexibly control the
> > > >>>>>> runtime behaviors of a job to adapt to varying deployment
> > > >>>>>> characteristics (e.g., harnessing transient resources in datacenters,
> > > >>>>>> cross-datacenter deployment, changing runtime based on job
> > > >>>>>> characteristics, etc.). Onyx provides ways to extend the system’s
> > > >>>>>> capabilities and incorporate the extensions to the flexible job
> > > >>>>>> execution.
> > > >>>>>> Onyx translates a user program (e.g., Apache Beam, Apache Spark) into
> > > >>>>>> an Intermediate Representation (IR) DAG, which Onyx optimizes and
> > > >>>>>> deploys based on a deployment policy.
> > > >>>>>>
> > > >>>>>> I've attached the proposal below.
> > > >>>>>>
> > > >>>>>> Best regards,
> > > >>>>>> Byung-Gon Chun
> > > >>>>>>
> > > >>>>>> = OnyxProposal =
> > > >>>>>>
> > > >>>>>> == Abstract ==
> > > >>>>>> Onyx is a data processing system for flexible employment with
> > > >>>>>> different execution scenarios for various deployment characteristics
> > > >>>>>> on clusters.
> > > >>>>>>
> > > >>>>>> == Proposal ==
> > > >>>>>> Today, there is a wide variety of data processing systems with
> > > >>>>>> different designs for better performance and datacenter efficiency.
> > > >>>>>> They include processing data on specific resource environments and
> > > >>>>>> running jobs with specific attributes. Although each system
> > > >>>>>> successfully solves the problems it targets, most systems are designed
> > > >>>>>> in such a way that runtime behaviors are built tightly inside the
> > > >>>>>> system core to hide the complexity of distributed computing. This
> > > >>>>>> makes it hard for a single system to support different deployment
> > > >>>>>> characteristics with different runtime behaviors without substantial
> > > >>>>>> effort.
> > > >>>>>>
> > > >>>>>> Onyx is a data processing system that aims to flexibly control the
> > > >>>>>> runtime behaviors of a job to adapt to varying deployment
> > > >>>>>> characteristics. Moreover, it provides a means of extending the
> > > >>>>>> system’s capabilities and incorporating the extensions into the
> > > >>>>>> flexible job execution.
> > > >>>>>>
> > > >>>>>> In order to be able to easily modify runtime behaviors to adapt to
> > > >>>>>> varying deployment characteristics, Onyx exposes runtime behaviors to
> > > >>>>>> be flexibly configured and modified at both compile-time and runtime
> > > >>>>>> through a set of high-level graph pass interfaces.
> > > >>>>>>
> > > >>>>>> We hope to contribute to the big data processing community by enabling
> > > >>>>>> more flexibility and extensibility in job executions. Furthermore, we
> > > >>>>>> can all benefit more when we work together as a community to mature
> > > >>>>>> the system with more use cases and understanding of diverse deployment
> > > >>>>>> characteristics. The Apache Software Foundation is the perfect place
> > > >>>>>> to achieve these aspirations.
> > > >>>>>>
> > > >>>>>> == Background ==
> > > >>>>>> Many data processing systems have distinctive runtime behaviors
> > > >>>>>> optimized and configured for specific deployment characteristics like
> > > >>>>>> different resource environments and for handling special job
> > > >>>>>> attributes.
> > > >>>>>>
> > > >>>>>> For example, much research has been conducted to overcome the
> > > >>>>>> challenge of running data processing jobs on cheap, unreliable
> > > >>>>>> transient resources. Likewise, techniques for disaggregating different
> > > >>>>>> types of resources, like memory, CPU and GPU, are being actively
> > > >>>>>> developed to use datacenter resources more efficiently. Many
> > > >>>>>> researchers are also working to run data processing jobs in even more
> > > >>>>>> diverse environments, such as across distant datacenters. Similarly,
> > > >>>>>> for special job attributes, many works take different approaches, such
> > > >>>>>> as runtime optimization, to solve problems like data skew, and to
> > > >>>>>> optimize systems for data processing jobs with small-scale input data.
> > > >>>>>>
> > > >>>>>> Although each of the systems performs well with the jobs and in the
> > > >>>>>> environments it targets, they perform poorly with unconsidered cases,
> > > >>>>>> and do not consider supporting multiple deployment characteristics on
> > > >>>>>> a single system in their designs.
> > > >>>>>>
> > > >>>>>> For an application writer to optimize an application to perform well
> > > >>>>>> on a certain system engraved with its underlying behaviors, it
> > > >>>>>> requires a deep understanding of the system itself, which is an
> > > >>>>>> overhead that often requires a lot of time and effort.
> > > >>>>>> Moreover, for a developer to modify such system behaviors, it requires
> > > >>>>>> modifications of the system core, which requires an even deeper
> > > >>>>>> understanding of the system itself.
> > > >>>>>>
> > > >>>>>> With this background, Onyx is designed to represent all of its jobs as
> > > >>>>>> an Intermediate Representation (IR) DAG. In the Onyx compiler, user
> > > >>>>>> applications from various programming models (e.g., Apache Beam) are
> > > >>>>>> submitted, transformed to an IR DAG, and optimized/customized for the
> > > >>>>>> deployment characteristics. In the IR DAG optimization phase, the DAG
> > > >>>>>> is modified through a series of compiler “passes” which reshape or
> > > >>>>>> annotate the DAG with an expression of the underlying runtime
> > > >>>>>> behaviors. The IR DAG is then submitted as an execution plan for the
> > > >>>>>> Onyx runtime. The runtime includes the unmodified parts of data
> > > >>>>>> processing in the backbone which is transparently integrated with
> > > >>>>>> configurable components exposed for further extension.
> > > >>>>>>
> > > >>>>>> == Rationale ==
> > > >>>>>> Onyx’s vision lies in providing means for flexibly supporting a wide
> > > >>>>>> variety of job execution scenarios for users while helping system
> > > >>>>>> developers extend the execution framework with various
> > > >>>>>> functionalities at the same time. The capabilities of the system can
> > > >>>>>> be extended as it grows to meet a wider variety of execution
> > > >>>>>> scenarios. We require inputs from users and developers from diverse
> > > >>>>>> domains in order to make it a more thriving and useful project. The
> > > >>>>>> Apache Software Foundation provides the best tools and community to
> > > >>>>>> support this vision.
> > > >>>>>>
> > > >>>>>> == Initial Goals ==
> > > >>>>>> Initial goals will be to move the existing codebase to Apache and
> > > >>>>>> integrate with the Apache development process. We further plan to
> > > >>>>>> develop our system to meet the needs of more execution scenarios for
> > > >>>>>> a wider variety of deployment characteristics.
> > > >>>>>>
> > > >>>>>> == Current Status ==
> > > >>>>>> The Onyx codebase is currently hosted in a repository at github.com.
> > > >>>>>> The current version has been developed by system developers at Seoul
> > > >>>>>> National University, Viva Republica, Samsung, and LG.
> > > >>>>>>
> > > >>>>>> == Meritocracy ==
> > > >>>>>> We plan to strongly support meritocracy. We will discuss the
> > > >>>>>> requirements in an open forum, and those that continuously contribute
> > > >>>>>> to Onyx with the passion to strengthen the system will be invited as
> > > >>>>>> committers. Contributors that enrich Onyx by providing various use
> > > >>>>>> cases, various implementations of the configurable components
> > > >>>>>> including ideas for optimization techniques will be especially
> > > >>>>>> welcome. Committers with a deep understanding of the system’s
> > > >>>>>> technical aspects as a whole and its philosophy will definitely be
> > > >>>>>> voted into the PMC. We will monitor community participation so that
> > > >>>>>> privileges can be extended to those that contribute.
> > > >>>>>>
> > > >>>>>> == Community ==
> > > >>>>>> We hope to expand our contribution community by becoming an Apache
> > > >>>>>> incubator project. The contributions will come from both users and
> > > >>>>>> system developers interested in flexibility and extensibility of job
> > > >>>>>> executions that Onyx can support. We expect users to mainly contribute
> > > >>>>>> by diversifying the use cases and deployment characteristics, and
> > > >>>>>> developers to contribute by implementing them.
> > > >>>>>>
> > > >>>>>> == Alignment ==
> > > >>>>>> Apache Spark is one of many popular data processing frameworks. The
> > > >>>>>> system is designed towards optimizing jobs using RDDs in memory and
> > > >>>>>> many other optimizations built tightly within the framework. In
> > > >>>>>> contrast to Spark, Onyx aims to provide more flexibility for job
> > > >>>>>> execution in an easy manner.
> > > >>>>>>
> > > >>>>>> Apache Tez enables developers to build complex task DAGs with control
> > > >>>>>> over the control plane of job execution. In Onyx, a high-level
> > > >>>>>> programming layer (e.g., Apache Beam) is automatically converted to a
> > > >>>>>> basic IR DAG and can be converted to any IR DAG through a series of
> > > >>>>>> easy user-writable passes that can both reshape and modify the
> > > >>>>>> annotation (of execution properties) of the DAG. Moreover, Onyx leaves
> > > >>>>>> more parts of the job execution configurable, such as the scheduler
> > > >>>>>> and the data plane. As opposed to providing a set of properties for
> > > >>>>>> solid optimization, Onyx’s configurable parts can be easily extended
> > > >>>>>> and explored by implementing the pre-defined interfaces. For example,
> > > >>>>>> an arbitrary intermediate data store can be added.
> > > >>>>>>
> > > >>>>>> Onyx currently supports Apache Beam programs and we are working on
> > > >>>>>> supporting Apache Spark programs as well. Onyx also utilizes Apache
> > > >>>>>> REEF for container management, which allows Onyx to run in Apache YARN
> > > >>>>>> and Apache Mesos clusters. If necessary, we plan to contribute to and
> > > >>>>>> collaborate with these other Apache projects for the benefit of all.
> > > >>>>>> We plan to extend such integrations with more Apache software.
> > > >>>>>> The Apache Software Foundation already hosts many major big-data
> > > >>>>>> systems, and we expect to help further growth of the big-data
> > > >>>>>> community by having Onyx within the Apache foundation.
> > > >>>>>>
> > > >>>>>> == Known Risks ==
> > > >>>>>> === Orphaned Products ===
> > > >>>>>> The risk of the Onyx project being orphaned is minimal. There is
> > > >>>>>> already plenty of work that arduously supports different deployment
> > > >>>>>> characteristics, and we propose a general way to implement them with
> > > >>>>>> flexible and extensible configuration knobs. The domain of data
> > > >>>>>> processing is already of high interest, and this domain is expected to
> > > >>>>>> evolve continuously with various other purposes, such as resource
> > > >>>>>> disaggregation and using transient resources for better datacenter
> > > >>>>>> resource utilization.
> > > >>>>>>
> > > >>>>>> === Inexperience with Open Source ===
> > > >>>>>> The initial committers include PMC members and committers of other
> > > >>>>>> Apache projects. They have experience with open source projects,
> > > >>>>>> starting from their incubation to the top-level. They have been
> > > >>>>>> involved in the open source development process, and are familiar with
> > > >>>>>> releasing code under an open source license.
> > > >>>>>>
> > > >>>>>> === Homogeneous Developers ===
> > > >>>>>> The initial set of committers is from a limited set of organizations,
> > > >>>>>> but we expect to attract new contributors from diverse organizations
> > > >>>>>> and will thus grow organically once approved for incubation. Our prior
> > > >>>>>> experience with other open source projects will help various
> > > >>>>>> contributors to actively participate in our project.
> > > >>>>>>
> > > >>>>>> === Reliance on Salaried Developers ===
> > > >>>>>> Many developers are from Seoul National University.
> > > >>>>>> This is not applicable.
> > > >>>>>>
> > > >>>>>> === Relationships with Other Apache Products ===
> > > >>>>>> Onyx positions itself among multiple Apache products. It runs on
> > > >>>>>> Apache REEF for container management. It also utilizes many useful
> > > >>>>>> development tools including Apache Maven, Apache Log4j, and multiple
> > > >>>>>> Apache Commons components. Onyx supports the Apache Beam programming
> > > >>>>>> model for user applications. We are currently working on supporting
> > > >>>>>> the Apache Spark programming APIs as well.
> > > >>>>>>
> > > >>>>>> === An Excessive Fascination with the Apache Brand ===
> > > >>>>>> We hope to make Onyx a powerful system for data processing, meeting
> > > >>>>>> various needs for different deployment characteristics, under a wider
> > > >>>>>> variety of environments. We see the limitations of simply putting code
> > > >>>>>> on GitHub, and we believe the Apache community will help the growth of
> > > >>>>>> Onyx for the project to become a positively impactful and innovative
> > > >>>>>> open source project. We believe Onyx is a great fit for the Apache
> > > >>>>>> Software Foundation due to the collaboration it aims to achieve from
> > > >>>>>> the big data processing community.
> > > >>>>>>
> > > >>>>>> == Documentation ==
> > > >>>>>> The current documentation for Onyx is at https://snuspl.github.io/onyx/.
> > > >>>>>>
> > > >>>>>> == Initial Source ==
> > > >>>>>> The Onyx codebase is currently hosted at https://github.com/snuspl/onyx.
> > > >>>>>>
> > > >>>>>> == External Dependencies ==
> > > >>>>>> To the best of our knowledge, all Onyx dependencies are distributed
> > > >>>>>> under Apache compatible licenses.
> > > >>>>>> Upon acceptance to the incubator, we
> > > >>>>>> would begin a thorough analysis of all transitive dependencies to
> > > >>>>>> verify this fact and further introduce license checking into the build
> > > >>>>>> and release process.
> > > >>>>>>
> > > >>>>>> == Cryptography ==
> > > >>>>>> Not applicable.
> > > >>>>>>
> > > >>>>>> == Required Resources ==
> > > >>>>>> === Mailing Lists ===
> > > >>>>>> We will operate two mailing lists as follows:
> > > >>>>>> * Onyx PMC discussions: priv...@onyx.incubator.apache.org
> > > >>>>>> * Onyx developers: d...@onyx.incubator.apache.org
> > > >>>>>>
> > > >>>>>> === Git Repositories ===
> > > >>>>>> Upon incubation: https://github.com/apache/incubator-onyx.
> > > >>>>>> After the incubation, we would like to move the existing repo
> > > >>>>>> https://github.com/snuspl/onyx to the Apache infrastructure.
> > > >>>>>>
> > > >>>>>> === Issue Tracking ===
> > > >>>>>> Onyx currently tracks its issues using the GitHub issue tracker:
> > > >>>>>> https://github.com/snuspl/onyx/issues. We plan to migrate to Apache
> > > >>>>>> JIRA.
> > > >>>>>>
> > > >>>>>> == Initial Committers ==
> > > >>>>>> * Byung-Gon Chun
> > > >>>>>> * Jeongyoon Eo
> > > >>>>>> * Geon-Woo Kim
> > > >>>>>> * Joo Yeon Kim
> > > >>>>>> * Gyewon Lee
> > > >>>>>> * Jung-Gil Lee
> > > >>>>>> * Sanha Lee
> > > >>>>>> * Wooyeon Lee
> > > >>>>>> * Yunseong Lee
> > > >>>>>> * JangHo Seo
> > > >>>>>> * Won Wook Song
> > > >>>>>> * Taegeon Um
> > > >>>>>> * Youngseok Yang
> > > >>>>>>
> > > >>>>>> == Affiliations ==
> > > >>>>>> * SNU (Seoul National University)
> > > >>>>>>   * Byung-Gon Chun
> > > >>>>>>   * Jeongyoon Eo
> > > >>>>>>   * Geon-Woo Kim
> > > >>>>>>   * Gyewon Lee
> > > >>>>>>   * Sanha Lee
> > > >>>>>>   * Wooyeon Lee
> > > >>>>>>   * Yunseong Lee
> > > >>>>>>   * JangHo Seo
> > > >>>>>>   * Won Wook Song
> > > >>>>>>   * Taegeon Um
> > > >>>>>>   * Youngseok Yang
> > > >>>>>>
> > > >>>>>> * LG
> > > >>>>>>   * Jung-Gil Lee
> > > >>>>>>
> > > >>>>>> * Samsung
> > > >>>>>>   * Joo Yeon Kim
> > > >>>>>>
> > > >>>>>> * Viva Republica
> > > >>>>>>   * Geon-Woo Kim
> > > >>>>>>
> > > >>>>>> == Sponsors ==
> > > >>>>>> === Champions ===
> > > >>>>>> Byung-Gon Chun
> > > >>>>>>
> > > >>>>>> === Mentors ===
> > > >>>>>> * Hyunsik Choi
> > > >>>>>> * Byung-Gon Chun
> > > >>>>>> * Markus Weimer
> > > >>>>>> * Reynold Xin
> > > >>>>>>
> > > >>>>>> === Sponsoring Entity ===
> > > >>>>>> The Apache Incubator
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> Byung-Gon Chun
> > > >>>>
> > > >>>> --
> > > >>>> Byung-Gon Chun
> > > >>>
> > > >>> --
> > > >>> Byung-Gon Chun
> > > >>
> > > >> ---------------------------------------------------------------------
> > > >> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > > >> For additional commands, e-mail: general-h...@incubator.apache.org
> >
> > --
> > Byung-Gon Chun

--
Byung-Gon Chun