Hi Eric,
Let me answer your second question first.

Q: Is it your intention to provide job submissions and data ingestion APIs for 
MR and HDFS, respectively?
A: Yes, we plan to progress the project to cover all existing ecosystem 
projects.  In addition, the project is based on a modular framework that allows 
extensions to cover services that are either new or proprietary.  
Certainly there are very high volume data ingest use cases for which using a 
gateway may be impractical, but in general the idea is to support all required 
client interaction with Hadoop via the gateway.

Now for your first question...

Q: Can you explain a bit more about what the target use case is?
A: One typical use case will be that the gateway runs in a DMZ.  It will, as 
you say, integrate with various directory services and is extensible to 
cover those not included.  The gateway will then propagate the identity into 
the Hadoop cluster using Hadoop-specific mechanisms.  The key point is that 
there will typically be a single port open on the client side to the gateway.  
The Hadoop cluster is firewalled and provides access to the Hadoop services 
only to the gateway instances.
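
To make that flow a bit more concrete, here is a rough sketch in Python. It is 
an illustration only, not code from the prototype. The gateway path layout, 
the internal host name and the "gateway" service user are all made up for the 
example. The `user.name` and `doas` query parameters are the ones WebHDFS 
accepts for proxy-user requests; a secured cluster would authenticate the 
gateway with Kerberos rather than `user.name`.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical mapping from gateway path prefixes to internal services.
# Host names and ports are placeholders, not real gateway configuration.
INTERNAL_SERVICES = {
    "/gateway/webhdfs/": "http://namenode.internal:50070/webhdfs/",
}


def rewrite_request(external_url, authenticated_user, service_user="gateway"):
    """Map a request received on the gateway's single external endpoint to an
    internal service URL, asserting the end user's identity via `doas`."""
    parts = urlsplit(external_url)
    for prefix, target in INTERNAL_SERVICES.items():
        if parts.path.startswith(prefix):
            target_parts = urlsplit(target)
            # Drop identity parameters supplied by the client so that only the
            # identity established by the gateway reaches the cluster.
            query = [
                (key, value)
                for key, value in parse_qsl(parts.query, keep_blank_values=True)
                if key.lower() not in ("doas", "user.name")
            ]
            query += [("user.name", service_user), ("doas", authenticated_user)]
            path = target_parts.path + parts.path[len(prefix):]
            return urlunsplit(
                (target_parts.scheme, target_parts.netloc, path, urlencode(query), "")
            )
    raise LookupError("no internal service registered for " + parts.path)
```

A client would only ever see the gateway URL; the rewritten URL is reachable 
solely from the gateway instances inside the firewall.
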
A: Another use case is that an organization is already using some SSO solution 
and the gateway would be integrated with that to verify any SSO token and then 
propagate the identity to the Hadoop services.
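
As a rough illustration of the "verify any SSO token" step, here is a small 
Python sketch. The token format (a base64 JSON payload plus an HMAC-SHA256 
signature) and the `issue_token` / `verify_token` names are assumptions made 
for the example; a real deployment would use whatever format the 
organization's SSO solution defines.

```python
import base64
import hashlib
import hmac
import json
import time


def issue_token(secret, user, ttl_seconds, now=None):
    """Stand-in for the SSO provider: sign a small JSON payload."""
    now = time.time() if now is None else now
    payload = json.dumps({"sub": user, "exp": now + ttl_seconds}).encode()
    body = base64.urlsafe_b64encode(payload)
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest().encode()
    return body + b"." + signature


def verify_token(secret, token, now=None):
    """Return the user name if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    body, separator, signature = token.rpartition(b".")
    if not separator:
        return None
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest().encode()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(signature, expected):
        return None
    try:
        claims = json.loads(base64.urlsafe_b64decode(body))
    except ValueError:
        return None
    if claims.get("exp", 0) < now:
        return None
    return claims.get("sub")
```

The user name returned by `verify_token` is the identity the gateway would 
then propagate to the Hadoop services.
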

I will collect this and add it to the proposal wiki once I have privs to create the page.

Thanks!
Kevin.

On 2/11/13 12:03 PM, Eric Sammer wrote:
Kevin:

Interesting proposal. Can you explain a bit more about what the target use
case is? It sounds like there's SSO-ish functionality (presumably a doAs()
machine) with integration with directory services, but the proposal also
mentions a single point for "data and jobs." Is it your intention to
provide job submissions and data ingestion APIs for MR and HDFS,
respectively? Do you plan to target other ecosystem projects such as HBase?
Sorry if I missed this in the proposal.

Thanks!


On Mon, Feb 11, 2013 at 6:55 AM, Kevin Minder
<kevin.min...@hortonworks.com> wrote:

Knox Gateway Proposal

== Abstract ==

Knox Gateway is a system that provides a single point of secure access for
Apache Hadoop clusters.

== Proposal ==

The Knox Gateway (“Gateway” or “Knox”) is a system that provides a single
point of authentication and access for Apache Hadoop services in a cluster.
The goal is to simplify Hadoop security for both users (i.e. who access the
cluster data and execute jobs) and operators (i.e. who control access and
manage the cluster). The Gateway runs as a server (or cluster of servers)
that serves one or more Hadoop clusters.

 * Provide perimeter security to make Hadoop security setup easier
 * Support authentication and token verification security scenarios
 * Deliver users a single cluster end-point that aggregates capabilities for
   data and jobs
 * Enable integration with enterprise and cloud identity management
   environments

== Background ==

An Apache Hadoop cluster is presented to consumers as a loose collection
of independent services. This makes it difficult for users to interact with
Hadoop since each service maintains its own method of access and security.
As well, for operators, configuration and administration of a secure Hadoop
cluster is complex, and many Hadoop clusters are insecure as a result.

== Rationale ==

Organizations that struggle with Hadoop cluster security end up either
a) running Hadoop without security or b) slowing their adoption of Hadoop. The
Gateway aims to provide perimeter security that integrates more easily into
existing organizations’ security infrastructure. Doing so will simplify
security for these organizations and benefit all Hadoop stakeholders (i.e.
users and operators). Additionally, making a dedicated perimeter security
project part of the Apache Hadoop ecosystem will prevent fragmentation in
this area and further increase the value of Hadoop as a data platform.

== Current Status ==

Prototype available, developed by the list of initial committers.

=== Meritocracy ===

We desire to build a diverse developer community around Gateway following
the Apache Way. We want to make the project open source and will encourage
contributors from multiple organizations following the Apache meritocracy
model.

=== Community ===

We hope to extend the user and developer base in the future and build a
solid open source community around Gateway. Apache Hadoop has a large
ecosystem of open source projects, each with a strong community of
contributors. All project communities in this ecosystem have an opportunity
to participate in the advancement of the Gateway project because
ultimately, Gateway will enable the security capabilities of their project
to be more enterprise friendly.

=== Core Developers ===

Gateway is currently being developed by several engineers from Hortonworks
- Kevin Minder, Larry McCay, John Speidel, Tom Beerbower and Sumit Mohanty.
All the engineers have deep expertise in middleware, security & identity
systems and are quite familiar with the Hadoop ecosystem.

=== Alignment ===

The ASF is a natural host for Gateway given that it is already the home of
Hadoop, Hive, Pig, HBase, Oozie and other emerging big data software
projects. Gateway is designed to solve the security challenges familiar to
the Hadoop ecosystem family of projects.

== Known Risks ==

=== Orphaned Products ===

The core developers plan to work full time on the project. We believe that
this project will be of general interest to many Hadoop users and will
attract a diverse set of contributors. We intend to demonstrate this by
having contributors from several organizations recognized as committers by
the time Knox graduates from incubation.

=== Inexperience with Open Source ===

All of the core developers are active users and followers of open source.
As well, Hortonworks has a strong heritage of success with contributions to
Apache Hadoop Projects.

=== Homogeneous Developers ===

The current core developers are from Hortonworks; however, we hope to
establish a developer community that includes contributors from several
corporations.

=== Reliance on Salaried Developers ===

Currently, the developers are paid to do work on Gateway. However, once
the project has a community built around it, we expect to get committers
and developers from outside the current core developers.

=== Relationships with Other Apache Products ===

Gateway is going to be used by the users and operators of Hadoop, and the
Hadoop ecosystem in general.

=== An Excessive Fascination with the Apache Brand ===

Our interest in developing Gateway as an Apache project is to follow an
established development model. As well, since many of the Hadoop ecosystem
projects are also part of Apache, Gateway will complement those projects by
following the same development and contribution model.

== Documentation ==

There is documentation in Hortonworks’ internal repositories. It can be
shared upon request and will be transferred into the Apache CM system if
this proposal is accepted.

== Initial Source ==

The source is currently in Hortonworks’ internal repositories. The process
of making this GitHub repository public has been started and the URL will
be provided once available.

== Source and Intellectual Property Submission Plan ==

The complete Gateway code is under Apache Software License 2.

== External Dependencies ==

The Gateway dependencies are listed below, separated by Category A and
Category B as defined in the Apache Third-Party Licensing Policy. Note:
These are the direct dependencies. Indirect dependencies are not included.

=== Category A Dependencies ===

Apache Commons - ASLv2.0
commons-io:commons-io#2.4
commons-cli:commons-cli#1.2
commons-codec:commons-codec#1.7
org.apache.commons:commons-digester3#3.2
org.apache.commons:commons-vfs2#2.0
Apache Hadoop - ASLv2.0
org.apache.hadoop:hadoop-auth#0.23.3
org.apache.hadoop:hadoop-core#1.0.3
Apache Geronimo - ASLv2.0
org.apache.geronimo.components:geronimo-jaspi#2.0.0
org.apache.geronimo.specs:geronimo-osgi-locator#1.1
Apache Shiro - ASLv2.0
org.apache.shiro:shiro-web#1.2.1
ApacheDS - ASLv2.0
org.apache.directory.server:apacheds-all#1.5.5
Log4J - ASLv2.0
log4j:log4j#1.2.17
SLF4J - MIT
org.slf4j:slf4j-api#1.6.6
org.slf4j:slf4j-log4j12#1.6.6
Guava - ASLv2.0
com.google.guava:guava#14.0-rc1
HttpClient - ASLv2.0
org.apache.httpcomponents:httpclient#4.2.1
Jetty - ASLv2.0
org.eclipse.jetty:jetty-server#8.1.7.v20120910
org.eclipse.jetty:jetty-servlet#8.1.7.v20120910
org.eclipse.jetty:jetty-webapp#8.1.7.v20120910
org.eclipse.jetty:jetty-jaspi#8.1.7.v20120910
org.eclipse.jetty.aggregate:jetty-all#8.1.7.v20120910
org.eclipse.jetty:test-jetty-servlet#8.1.7.v20120910
Spring Security - ASLv2.0
org.springframework:spring-core#3.1.3.RELEASE
org.springframework:spring-context#3.1.3.RELEASE
org.springframework:spring-web#3.1.3.RELEASE
org.springframework.security:spring-security-core#3.1.3.RELEASE
org.springframework.security:spring-security-web#3.1.3.RELEASE
org.springframework.security:spring-security-config#3.1.3.RELEASE
org.springframework.security:spring-security-ldap#3.1.2.RELEASE
org.springframework.ldap:spring-ldap-core#1.3.1.RELEASE
org.springframework.ldap:spring-ldap-core-tiger#1.3.1.RELEASE
org.springframework.ldap:spring-ldap-odm#1.3.1.RELEASE
org.springframework.ldap:spring-ldap-ldif-core#1.3.1.RELEASE
org.springframework.ldap:spring-ldap-ldif-batch#1.3.1.RELEASE
JBoss ShrinkWrap - ASLv2.0
org.jboss.shrinkwrap:shrinkwrap-api#1.0.1
org.jboss.shrinkwrap:shrinkwrap-impl-base#1.0.1
org.jboss.shrinkwrap.descriptors:shrinkwrap-descriptors-api-javaee#2.0.0-alpha-4
org.jboss.shrinkwrap.descriptors:shrinkwrap-descriptors-impl-javaee#2.0.0-alpha-4

=== Category A Dependencies (Test) ===

EasyMock - ASLv2.0
org.easymock:easymock#3.0
XML Matchers - ASLv2.0
org.xmlmatchers:xml-matchers#0.10
Hamcrest - BSDv3
org.hamcrest:hamcrest-api#1.0
org.hamcrest:hamcrest-core#1.2.1
org.hamcrest:hamcrest-library#1.2.1
JsonPath - ASLv2.0
com.jayway.jsonpath:json-path#0.8.1
com.jayway.jsonpath:json-path-assert#0.8.1
XMLTool - ASLv2.0
com.mycila.xmltool:xmltool#3.3
REST-assured - ASLv2.0
com.jayway.restassured:rest-assured#1.6.2

=== Category B Dependencies ===

Jersey - CDDLv1.1 or GPL2wCPE
com.sun.jersey:jersey-server#1.14
com.sun.jersey:jersey-servlet#1.14
Jericho - EPLv1.0
net.htmlparser.jericho:jericho-html#3.2
Servlet - CDDLv1.0 or GPLv2
javax.servlet:javax.servlet-api#3.0.1
JUnit - CPLv1.0
junit:junit#4.11

== Cryptography ==

The Gateway uses cryptographic software indirectly as a result of having
two dependencies: ApacheDS and Apache Shiro. Gateway does not include any
special or custom cryptographic technologies.

ApacheDS is an ASF project and has been classified Export Commodity
Control Number (ECCN) 5D002.C.1 due to its dependency on Bouncy Castle.
More information on the ApacheDS classification can be found at
http://svn.apache.org/repos/asf/directory/apacheds/trunk/installers/README

Apache Shiro is an ASF project and has been classified Export Commodity
Control Number (ECCN) 5D002.C.1. More information on the Apache Shiro
classification can be found at
http://svn.apache.org/repos/asf/shiro/trunk/README

== Required Resources ==

=== Mailing lists ===

knox-dev AT incubator DOT apache DOT org
knox-commits AT incubator DOT apache DOT org
knox-user AT incubator DOT apache DOT org
knox-private AT incubator DOT apache DOT org

=== Subversion Directory ===

https://svn.apache.org/repos/asf/incubator/knox

=== Issue Tracking ===

JIRA Knox (KNOX)

== Initial Committers ==

Kevin Minder (kevin DOT minder AT hortonworks DOT com)
Larry McCay (lmccay AT hortonworks DOT com)
John Speidel (jspeidel AT hortonworks DOT com)
Tom Beerbower (tbeerbower AT hortonworks DOT com)
Sumit Mohanty (smohanty AT hortonworks DOT com)

== Affiliations ==

Kevin Minder (Hortonworks)
Larry McCay (Hortonworks)
John Speidel (Hortonworks)
Tom Beerbower (Hortonworks)
Sumit Mohanty (Hortonworks)

== Sponsors ==

=== Champion ===

Devaraj Das (ddas AT apache DOT org)

=== Nominated Mentors ===

Owen O’Malley (omalley AT apache DOT org)
Mahadev Konar (mahadev AT apache DOT org)
Alan Gates (gates AT apache DOT org)
Devaraj Das (ddas AT apache DOT org)

=== Sponsoring Entity ===

Incubator PMC

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org