I thought about this a bit last night. If y'all are interested, I could also mentor the project. That should add some diversity to the mentors list. I see value in it and would like to see this community succeed.
I'm not affiliated with any company.

On Mon, Feb 11, 2013 at 9:23 PM, Eric Sammer <esam...@cloudera.com> wrote:

> Kevin:
>
> Makes complete sense.
>
> I'd like to offer to join the project, if it's accepted for incubation. I'm
> a committer on MRUnit and Flume, and on the PMC for both. I've helped both
> projects through the incubation phase, and I also know a little bit about
> this Hadoop thing. ;)
>
> Thanks!
>
>
> On Mon, Feb 11, 2013 at 9:28 AM, Kevin Minder <kevin.min...@hortonworks.com> wrote:
>
> > Hi Eric,
> > Let me answer your second question first.
> >
> > Q: Is it your intention to provide job submissions and data ingestion APIs
> > for MR and HDFS, respectively?
> > A: Yes, we plan to progress the project to cover all existing ecosystem
> > projects. In addition, the project is based on a modular framework that
> > allows for each extension to cover services that are either new or
> > proprietary. Certainly there exist very high volume data ingest use cases
> > for which using a gateway may be impractical, but in general the idea is to
> > support all required client interaction with Hadoop via the gateway.
> >
> > Now for your first question...
> >
> > Q: Can you explain a bit more about what the target use case is?
> > A: One typical use case will be that the gateway will run in a DMZ. It
> > will, as you say, integrate with various directory services and is
> > extensible to cover those not included. The gateway will then propagate
> > the identity into the Hadoop cluster using Hadoop specific mechanisms. The
> > key point is that there will typically be a single port open on the client
> > side to the gateway. The Hadoop cluster is firewalled, only providing
> > access to the Hadoop services to the gateway instances.
> > A: Another use case is that an organization is already using some SSO
> > solution and the gateway would be integrated with that to verify any SSO
> > token and then propagate the identity to the Hadoop services.
> >
> > I will collect this and add it to the proposal wiki once I have privs to
> > create the page.
> >
> > Thanks!
> > Kevin.
> >
> >
> > On 2/11/13 12:03 PM, Eric Sammer wrote:
> >
> >> Kevin:
> >>
> >> Interesting proposal. Can you explain a bit more about what the target use
> >> case is? It sounds like there's SSO-ish functionality (presumably a doAs()
> >> machine) with integration with directory services, but the proposal also
> >> mentions a single point for "data and jobs." Is it your intention to
> >> provide job submissions and data ingestion APIs for MR and HDFS,
> >> respectively? Do you plan to target other ecosystem projects such as
> >> HBase?
> >> Sorry if I missed this in the proposal.
> >>
> >> Thanks!
> >>
> >>
> >> On Mon, Feb 11, 2013 at 6:55 AM, Kevin Minder <kevin.min...@hortonworks.com> wrote:
> >>
> >>> Knox Gateway Proposal
> >>>
> >>> == Abstract ==
> >>>
> >>> Knox Gateway is a system that provides a single point of secure access
> >>> for Apache Hadoop clusters.
> >>>
> >>> == Proposal ==
> >>>
> >>> The Knox Gateway (“Gateway” or “Knox”) is a system that provides a single
> >>> point of authentication and access for Apache Hadoop services in a cluster.
> >>> The goal is to simplify Hadoop security for both users (i.e. who access the
> >>> cluster data and execute jobs) and operators (i.e. who control access and
> >>> manage the cluster). The Gateway runs as a server (or cluster of servers)
> >>> that serves one or more Hadoop clusters.
> >>>
> >>> Provide perimeter security to make Hadoop security setup easier
> >>> Support authentication and token verification security scenarios
> >>> Deliver users a single cluster end-point that aggregates capabilities for data and jobs
> >>> Enable integration with enterprise and cloud identity management environments
> >>>
> >>> == Background ==
> >>>
> >>> An Apache Hadoop cluster is presented to consumers as a loose collection
> >>> of independent services. This makes it difficult for users to interact with
> >>> Hadoop since each service maintains its own method of access and security.
> >>> As well, for operators, configuration and administration of a secure Hadoop
> >>> cluster is complex, and many Hadoop clusters are insecure as a result.
> >>>
> >>> == Rationale ==
> >>>
> >>> Organizations that struggle with Hadoop cluster security end up either
> >>> a) running Hadoop without security or b) slowing their adoption of Hadoop.
> >>> The Gateway aims to provide perimeter security that integrates more easily
> >>> into organizations’ existing security infrastructure. Doing so will simplify
> >>> security for these organizations and benefit all Hadoop stakeholders (i.e.
> >>> users and operators). Additionally, making a dedicated perimeter security
> >>> project part of the Apache Hadoop ecosystem will prevent fragmentation in
> >>> this area and further increase the value of Hadoop as a data platform.
> >>>
> >>> == Current Status ==
> >>>
> >>> Prototype available, developed by the list of initial committers.
> >>>
> >>> === Meritocracy ===
> >>>
> >>> We desire to build a diverse developer community around Gateway following
> >>> the Apache Way. We want to make the project open source and will encourage
> >>> contributors from multiple organizations following the Apache meritocracy
> >>> model.
> >>>
> >>> === Community ===
> >>>
> >>> We hope to extend the user and developer base in the future and build a
> >>> solid open source community around Gateway. Apache Hadoop has a large
> >>> ecosystem of open source projects, each with a strong community of
> >>> contributors. All project communities in this ecosystem have an opportunity
> >>> to participate in the advancement of the Gateway project because
> >>> ultimately, Gateway will enable the security capabilities of their project
> >>> to be more enterprise friendly.
> >>>
> >>> === Core Developers ===
> >>>
> >>> Gateway is currently being developed by several engineers from Hortonworks
> >>> - Kevin Minder, Larry McCay, John Speidel, Tom Beerbower and Sumit Mohanty.
> >>> All the engineers have deep expertise in middleware, security & identity
> >>> systems and are quite familiar with the Hadoop ecosystem.
> >>>
> >>> === Alignment ===
> >>>
> >>> The ASF is a natural host for Gateway given that it is already the home of
> >>> Hadoop, Hive, Pig, HBase, Oozie and other emerging big data software
> >>> projects. Gateway is designed to solve the security challenges familiar to
> >>> the Hadoop ecosystem family of projects.
> >>>
> >>> == Known Risks ==
> >>>
> >>> === Orphaned products & Reliance on Salaried Developers ===
> >>>
> >>> The core developers plan to work full time on the project. We believe that
> >>> this project will be of general interest to many Hadoop users and will
> >>> attract a diverse set of contributors. We intend to demonstrate this by
> >>> having contributors from several organizations recognized as committers by
> >>> the time Knox graduates from incubation.
> >>>
> >>> === Inexperience with Open Source ===
> >>>
> >>> All of the core developers are active users and followers of open source.
> >>> As well, Hortonworks has a strong heritage of success with contributions to
> >>> Apache Hadoop Projects.
> >>>
> >>> === Homogeneous Developers ===
> >>>
> >>> The current core developers are from Hortonworks; however, we hope to
> >>> establish a developer community that includes contributors from several
> >>> corporations.
> >>>
> >>> === Reliance on Salaried Developers ===
> >>>
> >>> Currently, the developers are paid to do work on Gateway. However, once
> >>> the project has a community built around it, we expect to get committers
> >>> and developers from outside the current core developers.
> >>>
> >>> === Relationships with Other Apache Products ===
> >>>
> >>> Gateway is going to be used by the users and operators of Hadoop, and the
> >>> Hadoop ecosystem in general.
> >>>
> >>> === An Excessive Fascination with the Apache Brand ===
> >>>
> >>> Our interest in developing Gateway as an Apache project is to follow an
> >>> established development model. As well, since many of the Hadoop ecosystem
> >>> projects are also part of Apache, Gateway will complement those projects by
> >>> following the same development and contribution model.
> >>>
> >>> == Documentation ==
> >>>
> >>> There is documentation in Hortonworks’ internal repositories. These can be
> >>> shared upon request and will be transferred into the Apache CM system if
> >>> this proposal is accepted.
> >>>
> >>> == Initial Source ==
> >>>
> >>> The source is currently in Hortonworks’ internal repositories. The process
> >>> of making this GitHub repository public has been started and the URL will
> >>> be provided once available.
> >>>
> >>> == Source and Intellectual Property Submission Plan ==
> >>>
> >>> The complete Gateway code is under Apache Software License 2.
> >>>
> >>> == External Dependencies ==
> >>>
> >>> The Gateway dependencies are listed below, separated by Category A and
> >>> Category B as defined in the Apache Third-Party Licensing Policy. Note:
> >>> These are the direct dependencies. Indirect dependencies are not included.
> >>>
> >>> === Category A Dependencies ===
> >>>
> >>> Apache Commons - ASLv2.0
> >>>   commons-io:commons-io#2.4
> >>>   commons-cli:commons-cli#1.2
> >>>   commons-codec:commons-codec#1.7
> >>>   org.apache.commons:commons-digester3#3.2
> >>>   org.apache.commons:commons-vfs2#2.0
> >>> Apache Hadoop - ASLv2.0
> >>>   org.apache.hadoop:hadoop-auth#0.23.3
> >>>   org.apache.hadoop:hadoop-core#1.0.3
> >>> Apache Geronimo - ASLv2.0
> >>>   org.apache.geronimo.components:geronimo-jaspi#2.0.0
> >>>   org.apache.geronimo.specs:geronimo-osgi-locator#1.1
> >>> Apache Shiro - ASLv2.0
> >>>   org.apache.shiro:shiro-web#1.2.1
> >>> ApacheDS - ASLv2.0
> >>>   org.apache.directory.server:apacheds-all#1.5.5
> >>>
> >>> Log4J - ASLv2.0
> >>>   log4j:log4j#1.2.17
> >>> SLF4J - MIT
> >>>   org.slf4j:slf4j-api#1.6.6
> >>>   org.slf4j:slf4j-log4j12#1.6.6
> >>> Guava - ASLv2.0
> >>>   com.google.guava:guava#14.0-rc1
> >>> HttpClient - ASLv2.0
> >>>   org.apache.httpcomponents:httpclient#4.2.1
> >>> Jetty - ASLv2.0
> >>>   org.eclipse.jetty:jetty-server#8.1.7.v20120910
> >>>   org.eclipse.jetty:jetty-servlet#8.1.7.v20120910
> >>>   org.eclipse.jetty:jetty-webapp#8.1.7.v20120910
> >>>   org.eclipse.jetty:jetty-jaspi#8.1.7.v20120910
> >>>   org.eclipse.jetty.aggregate:jetty-all#8.1.7.v20120910
> >>>   org.eclipse.jetty:test-jetty-servlet#8.1.7.v20120910
> >>> Spring Security - ASLv2.0
> >>>   org.springframework:spring-core#3.1.3.RELEASE
> >>>   org.springframework:spring-context#3.1.3.RELEASE
> >>>   org.springframework:spring-web#3.1.3.RELEASE
> >>>   org.springframework.security:spring-security-core#3.1.3.RELEASE
> >>>   org.springframework.security:spring-security-web#3.1.3.RELEASE
> >>>   org.springframework.security:spring-security-config#3.1.3.RELEASE
> >>>   org.springframework.security:spring-security-ldap#3.1.2.RELEASE
> >>>   org.springframework.ldap:spring-ldap-core#1.3.1.RELEASE
> >>>   org.springframework.ldap:spring-ldap-core-tiger#1.3.1.RELEASE
> >>>   org.springframework.ldap:spring-ldap-odm#1.3.1.RELEASE
> >>>   org.springframework.ldap:spring-ldap-ldif-core#1.3.1.RELEASE
> >>>   org.springframework.ldap:spring-ldap-ldif-batch#1.3.1.RELEASE
> >>> JBoss ShrinkWrap - ASLv2.0
> >>>   org.jboss.shrinkwrap:shrinkwrap-api#1.0.1
> >>>   org.jboss.shrinkwrap:shrinkwrap-impl-base#1.0.1
> >>>   org.jboss.shrinkwrap.descriptors:shrinkwrap-descriptors-api-javaee#2.0.0-alpha-4
> >>>   org.jboss.shrinkwrap.descriptors:shrinkwrap-descriptors-impl-javaee#2.0.0-alpha-4
> >>>
> >>>
> >>> === Category A Dependencies (Test) ===
> >>>
> >>> EasyMock - ASLv2.0
> >>>   org.easymock:easymock#3.0
> >>> XML Matchers - ASLv2.0
> >>>   org.xmlmatchers:xml-matchers#0.10
> >>>
> >>> Hamcrest - BSDv3
> >>>   org.hamcrest:hamcrest-api#1.0
> >>>   org.hamcrest:hamcrest-core#1.2.1
> >>>   org.hamcrest:hamcrest-library#1.2.1
> >>> JsonPath - ASLv2.0
> >>>   com.jayway.jsonpath:json-path#0.8.1
> >>>   com.jayway.jsonpath:json-path-assert#0.8.1
> >>>
> >>> XMLTool - ASLv2.0
> >>>   com.mycila.xmltool:xmltool#3.3
> >>> REST-assured - ASLv2.0
> >>>   com.jayway.restassured:rest-assured#1.6.2
> >>>
> >>>
> >>> === Category B Dependencies ===
> >>>
> >>> Jersey - CDDLv1.1 or GPL2wCPE
> >>>   com.sun.jersey:jersey-server#1.14
> >>>   com.sun.jersey:jersey-servlet#1.14
> >>> Jericho - EPLv1.0
> >>>   net.htmlparser.jericho:jericho-html#3.2
> >>>
> >>> Servlet - CDDLv1.0 or GPLv2
> >>>   javax.servlet:javax.servlet-api#3.0.1
> >>>
> >>> JUnit - CPLv1.0
> >>>   junit:junit#4.11
> >>>
> >>> == Cryptography ==
> >>>
> >>> The Gateway uses cryptographic software indirectly as a result of having
> >>> two dependencies: ApacheDS and Apache Shiro. Gateway does not include any
> >>> special or custom cryptographic technologies.
> >>>
> >>> ApacheDS is an ASF project and has been classified Export Commodity
> >>> Control Number (ECCN) 5D002.C.1 due to its dependency on Bouncy Castle.
> >>> More information on the ApacheDS classification can be found at
> >>> http://svn.apache.org/repos/asf/directory/apacheds/trunk/installers/README
> >>>
> >>> Apache Shiro is an ASF project and has been classified Export Commodity
> >>> Control Number (ECCN) 5D002.C.1. More information on the Apache Shiro
> >>> classification can be found at
> >>> http://svn.apache.org/repos/asf/shiro/trunk/README
> >>>
> >>> == Required Resources ==
> >>>
> >>> === Mailing lists ===
> >>>
> >>> knox-dev AT incubator DOT apache DOT org
> >>> knox-commits AT incubator DOT apache DOT org
> >>> knox-user AT incubator DOT apache DOT org
> >>> knox-private AT incubator DOT apache DOT org
> >>>
> >>> === Subversion Directory ===
> >>>
> >>> https://svn.apache.org/repos/asf/incubator/knox
> >>>
> >>> === Issue Tracking ===
> >>>
> >>> JIRA Knox (KNOX)
> >>>
> >>> == Initial Committers ==
> >>>
> >>> Kevin Minder (kevin DOT minder AT hortonworks DOT com)
> >>> Larry McCay (lmccay AT hortonworks DOT com)
> >>> John Speidel (jspeidel AT hortonworks DOT com)
> >>> Tom Beerbower (tbeerbower AT hortonworks DOT com)
> >>> Sumit Mohanty (smohanty AT hortonworks DOT com)
> >>>
> >>> == Affiliations ==
> >>>
> >>> Kevin Minder (Hortonworks)
> >>> Larry McCay (Hortonworks)
> >>> John Speidel (Hortonworks)
> >>> Tom Beerbower (Hortonworks)
> >>> Sumit Mohanty (Hortonworks)
> >>>
> >>> == Sponsors ==
> >>>
> >>> === Champion ===
> >>>
> >>> Devaraj Das (ddas AT apache DOT org)
> >>>
> >>> === Nominated Mentors ===
> >>>
> >>> Owen O’Malley (omalley AT apache DOT org)
> >>> Mahadev Konar (mahadev AT apache DOT org)
> >>> Alan Gates (gates AT apache DOT org)
> >>> Devaraj Das (ddas AT apache DOT org)
> >>>
> >>> === Sponsoring Entity ===
> >>>
> >>> Incubator PMC
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> >>> For additional commands, e-mail: general-help@incubator.apache.org
>
> --
> Eric Sammer
> twitter: esammer
> data: www.cloudera.com

--
Best Regards,
-- Alex