Glad to see Trafodion submitted as an Apache Incubator project.

Good luck

Venkat


On May 14, 2015, at 4:13 PM, "Birdsall, Dave" <dave.birds...@hp.com> wrote:

Hi Lars,

Glad to have you on the mentor list!

Dave

-----Original Message-----
From: lars hofhansl [mailto:la...@apache.org]
Sent: Thursday, May 14, 2015 3:39 PM
To: general@incubator.apache.org
Subject: Re: [DISCUSS] Trafodion Incubation Proposal

Bit belated... I'd be happy and honored to be a mentor for Trafodion.
-- Lars
      From: Stack <st...@duboce.net>
 To: general@incubator.apache.org
 Sent: Friday, May 8, 2015 2:59 PM
 Subject: [DISCUSS] Trafodion Incubation Proposal

I would like to start up a discussion on Trafodion joining the ASF as an 
incubating project.

Trafodion is a webscale SQL-on-Hadoop solution that enables transactional or 
operational workloads on Hadoop.

The proposal is available on the wiki here:
https://wiki.apache.org/incubator/TrafodionProposal#preview

The proposal text is also attached to the end of this email.

Trafodion is a rich, storied SQL engine that has recently been ported to run on 
HBase and Hadoop. I think it would make for a fine addition to the Apache 
family of projects. It would be good to hear what others think.

Thank you in advance for giving the proposal a read.

Yours,
St.Ack


Trafodion Apache Incubator Proposal

Abstract

Trafodion is a webscale SQL-on-Hadoop solution enabling transactional or 
operational workloads on Hadoop.

Proposal

Apache Trafodion builds on the scalability, elasticity, and flexibility of 
Hadoop. Trafodion extends Hadoop to provide guaranteed transactional integrity, 
enabling new kinds of big data applications to run on Hadoop. Key features of 
Apache Trafodion include:

* Full-functioned ANSI SQL language support
* JDBC/ODBC connectivity for Linux/Windows clients
* Distributed ACID transaction protection across multiple statements, tables 
and rows
* Performance improvements for OLTP workloads with compile-time and run-time 
optimizations
* Support for large data sets using a parallel-aware query optimizer
* ANSI SQL security and data integrity constraints including referential 
integrity

Hewlett-Packard Company submits this proposal to donate its Apache License, 
Version 2.0 open source project known as Trafodion, its source code, 
documentation, and web site content to the Apache Software Foundation in order 
to build an open source community.

Background

Trafodion is an open source project sponsored by HP, incubated at HP Labs and 
HP-IT, to develop an enterprise-class SQL-on-Hadoop solution targeting big data 
transactional or operational workloads. HP publicly announced the open source 
project and uploaded the source code to GitHub in June 2014.

The SQL compiler, optimizer and executor components of Trafodion have a rich 
heritage. Under development since 1993, they were released as commercial closed 
source software in various flavors such as HP NonStop SQL/MX and HP Neoview. 
NonStop SQL/MX was designed for online transaction processing on HP’s NonStop 
(formerly Tandem) fault-tolerant servers and is known for its high 
availability, scalability, and performance. Hundreds of companies and thousands 
of servers are running mission-critical applications today on NonStop SQL/MX. 
In addition, many of these components run inside HP today as the 
core of its Enterprise Data Warehouse (EDW), managing over a PB of data.

Starting in 2013, the software was modified to run on HBase and a new 
distributed transaction manager was written to run as an HBase co-processor.

Unlike most NoSQL and other SQL-on-Hadoop open source projects, Trafodion 
provides comprehensive ANSI SQL language support including full-functioned data 
definition (DDL), data manipulation (DML), transaction control (TCL) and 
database utility support.

Trafodion provides comprehensive and standard SQL data manipulation support 
including SELECT, INSERT, UPDATE, DELETE, and UPSERT/MERGE syntax with language 
options including join variants, unions, where predicates, aggregations (group 
by and having), sort ordering, sampling, correlated and nested sub-queries, 
cursors, and many SQL functions.

Utilities are provided for updating the table statistics that the optimizer 
uses to cost plan alternatives (i.e. selectivity/cardinality estimates), for 
displaying the chosen SQL execution plan, for plan shaping, for backing up and 
restoring the database, and for data loading and unloading. A command line 
utility is provided for interfacing with the database engine.

Explicit control statements are provided to allow applications to define 
transaction boundaries and to abort transactions when warranted, including 
BEGIN WORK, COMMIT WORK, ROLLBACK WORK and SET TRANSACTION.
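
As a rough illustration of what explicit transaction boundaries buy an 
application, here is a minimal sketch. It does not use Trafodion: Python's 
built-in sqlite3 module stands in for the SQL engine, and SQLite spells the 
statements BEGIN, COMMIT and ROLLBACK rather than BEGIN WORK, COMMIT WORK and 
ROLLBACK WORK. The accounts table and the transfer and balances helpers are 
hypothetical names invented for this example.

```python
import sqlite3

# SQLite is only a stand-in for a transactional SQL engine; this is not Trafodion.
# isolation_level=None leaves transaction control to the explicit
# BEGIN / COMMIT / ROLLBACK statements issued below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on commit."""
    conn.execute("BEGIN")
    try:
        # Credit first, then debit, so a failed debit leaves work to undo.
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; abort both updates.
        conn.execute("ROLLBACK")
        return False


def balances(conn):
    """Return {id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

Calling transfer(conn, 1, 2, 500) against these starting balances fails the 
CHECK constraint on the debit, and the rollback also undoes the credit that had 
already been applied, so neither row changes.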

Trafodion supports ANSI’s grant/revoke semantics to define user and role 
privileges in terms of managing and accessing the database objects.

Rationale

The name “Trafodion” (the Welsh word for transactions, pronounced
“Tra-vod-eee-on”) was chosen specifically to emphasize the differentiation that 
Trafodion provides in closing a critical gap in the Hadoop ecosystem.
Trafodion builds on the scalability, elasticity, and flexibility of Hadoop.
Trafodion extends Hadoop to provide guaranteed transactional integrity, 
enabling new kinds of big data applications to run on Hadoop.

Current Status

HP released the Trafodion code under the Apache License, Version 2, in June of 
2014. Since that time, we have had one major release in January 2015 and one 
minor release in April 2015. The focus of these releases has been in getting 
our base functionality, including security, working on top of Apache HBase, as 
well as improving performance, availability and scalability, and integrating 
better with HBase.

Meritocracy

We want to build a diverse developer community, based on the Apache Way, around 
Trafodion. To help developers become contributors, we have documentation on the 
wiki about the architecture, the source tree structure, and an example 
enhancement. We plan to publish our project backlog to the community, 
specifically highlighting areas where developers new to Trafodion may best 
start contributing, such as extending the database functionality with User 
Defined Routines (UDRs) and integrating with other Apache projects in the 
Hadoop ecosystem.

Community

We have already begun building a community but at this time the community 
consists only of Trafodion developers – all HP employees – and prospective 
users. We have participated in and hosted HBase Meetups and intend to ramp up 
our community building efforts.

The Trafodion project has seen interest in China, where HP has conducted 
proof-of-concepts with multiple companies and expects to see some of its first 
commercial deployments. To help recruit contributors and users in China, 
members of the team are translating Trafodion wiki content into Mandarin.

Core Developers

The core developers are very experienced in database and transaction monitor 
technology, with many having spent more than 20 years working in this space.

Alignment

Apache Trafodion relies on Apache HBase as its storage engine. The development 
team has collaborated with and gained valuable advice from working with the 
Apache HBase core developers. Apache Trafodion has federation capabilities as 
well, and can query Trafodion tables stored in HBase, native HBase tables, and 
Apache Hive tables.

Known Risks

Orphaned Products

HP Labs and HP-IT have been incubating Trafodion development for almost two 
years. This is part of HP’s strategy to leverage its investment in database 
software and bring software to market as open source and is similar to HP’s 
efforts with OpenStack. Trafodion builds on HP’s equity investment in the 
Hadoop ecosystem and its efforts to monetize Hadoop through hardware, software, 
and services. HP wants Trafodion to be successful, as HP will offer a 
commercially supported distribution of Trafodion.

Inexperience with Open Source

We have been working with open source software in building closed source 
software for well over two decades. To help transition to doing open source 
development, the development team received guidance and best practices from HP 
developers working on OpenStack open source projects, many of whom have 
experience working on Apache and other open source projects as well. Since 
releasing Trafodion as an open source project in June of 2014, the committers 
and contributors have moved forward using open source development processes and 
tools for bug tracking and design blueprints, and Jenkins for continuous 
integration. As part of the incubation process, we recognize we may need to 
change some of our development processes/tools and conduct our discussions 
on Apache mailing lists.

Homogenous Developers

Since the initial development of Trafodion has been supported by HP, all of the 
current developers are HP employees. Through the support of the Apache 
incubation project, we aim to expand the list of developers and gain 
contributors from related SQL-on-Hadoop projects and the Apache HBase project. 
Trafodion developers are experienced with distributed development processes, 
being primarily based in Palo Alto, CA; Austin, TX; and Shanghai, China. 
Trafodion is written in C++ and Java.

Reliance on Salaried Developers

Currently all of the developers working on the project are paid by their 
employer to work on the project. These developers will work on the open source 
project as well as work on the commercially supported distribution of Trafodion 
that HP will offer.

Relationship with Other Apache Products

Trafodion is built upon Apache HBase and extends it to support ACID 
transactions with HBase co-processors for distributed transaction management 
and recovery. Trafodion envisions future collaborations with the Apache HBase 
project on performance optimizations, such as in the areas of mixed workload 
support and high availability. It also provides transactional support for and 
querying of native HBase tables.

Trafodion uses Apache ZooKeeper to coordinate and manage the distribution of 
connection services across the cluster for load-balancing and high availability 
reconnection purposes in the event a Trafodion process should fail.

Trafodion also envisions working with the Apache Ambari project on enabling 
better Trafodion manageability. While Ambari focuses on system and component 
level performance metrics, Trafodion manageability will focus in a 
complementary way on database workload monitoring and performance analytics 
with capabilities more geared towards database administrators.

There are alternative open source projects that are providing SQL-on-Hadoop 
capabilities, such as Apache Hive, Apache Drill, and Apache Phoenix. These are 
more focused on reporting and analytics across data structures supported on 
HDFS. In comparison to all of these technologies, Trafodion provides a very 
complete implementation of ANSI SQL, one of the most sophisticated optimizers 
for such workloads, a completely parallel data flow architecture that does not 
materialize intermediate results unless necessary, full ACID transactional 
support, ANSI GRANT/REVOKE security, and other capabilities that would take 
decades to build in these products. On the other hand, Trafodion currently 
focuses only on HBase and on querying Hive, whereas Hive and Drill provide 
access to other data formats in HDFS.

An Excessive Fascination with the Apache Brand

We understand the reputation and value of the Apache brand, and have no doubt 
that it will help us attract contributors and users. Our primary goal 
is to follow a proven, open source development and community building model 
that will make Trafodion successful and enable better collaboration with other 
Apache projects in the Hadoop ecosystem. We also understand the rules and 
guidelines about the use of the Apache brand and intend to follow them.

Documentation

Documentation and technical details on Trafodion can be found at:
http://www.trafodion.org/

Initial Source

The source is available today in a public GitHub repository:
https://github.com/trafodion/trafodion

Source and Intellectual Property Submission Plan

The source code has already been released under the Apache License, Version 2. 
The manuals have been released in Adobe PDF format. As part of the submission 
process, the source for the manuals will be converted from a proprietary 
DocBook XML format to AsciiDoc.

External Dependencies

Two dependencies do not have Apache compatible licenses and will be addressed 
as we enter incubation. One dependency is log4cpp, which is licensed under the 
LGPL. A compatible alternative might be Apache incubator project log4cxx. The 
other dependency is unixodbc, which is used as the ODBC driver manager. We will 
look into how Apache Hive manages being able to use this incompatible software 
and do something similar. All other dependencies have Apache compatible licenses, 
including Apache 2.0, MIT/X11, MIT, and BSD.

Cryptography

Trafodion does not contain any cryptographic code. It does call cryptographic 
libraries: OpenSSL for C++ code and Java Cryptography Extension (JCE) for Java 
code.

Required Resources

Mailing Lists

priv...@trafodion.incubator.apache.org
d...@trafodion.incubator.apache.org
comm...@trafodion.incubator.apache.org

Git Repository

https://git-wip-us.apache.org/repos/asf/incubator-trafodion.git

Issue Tracking

JIRA: JIRA Trafodion (Trafodion)


Initial Committers and Affiliation

Dave Birdsall, Hewlett-Packard Company, Dave.Birdsall<AT>hp<DOT>com
Matt Brown, Hewlett-Packard Company, mattbrown<AT>hp<DOT>com
Tharak Capirala, Hewlett-Packard Company, Tharak.Capirala<AT>hp<DOT>com
Alice Chen, Hewlett-Packard Company, Alice.Chen<AT>hp<DOT>com
John DeRoo, Hewlett-Packard Company, John.Deroo<AT>hp<DOT>com
Roberta Marton, Hewlett-Packard Company, Roberta.Marton<AT>hp<DOT>com
Amanda Moran, Hewlett-Packard Company, Amanda.Kay.Moran<AT>hp<DOT>com
Suresh Subbiah, Hewlett-Packard Company, Suresh.Subbiah<AT>hp<DOT>com
Sandhya Sundaresan, Hewlett-Packard Company, Sandhya.Sundaresan<AT>hp<DOT>com

Sponsors

Champion

Michael Stack, Stack<AT>apache<DOT>org

Nominated Mentors

Michael Stack, Stack<AT>apache<DOT>org
Roman Shaposhnik, rshaposhnik<AT>pivotal<DOT>io

We are seeking additional mentors.

Sponsoring Entity

Apache Incubator PMC



