Hi, I'm looking for information or confirmation about accessing HDFS clusters that use a different security model from our own.
The situation is that we run our own cluster and the edge nodes all use the appropriate configuration files. So far so good. However, we offer a hosted service and some of our customers will want to access external HDFS clusters, e.g., to pull in data for processing within our environment and/or push out the results.

It's possible that the external clusters will have a different security model. That is, we use Kerberos authentication internally but the external cluster does not, or vice versa. This could happen if the external site contains read-only public domain information. An example that comes to mind is the FAA flight information that we used in the Coursera cloud computing class: it's public domain information so it won't require Kerberos authentication to access, but a company using the data internally may wish to use Kerberos authentication on its own cluster.

I know that the FileSystem.get() method has an optional Hadoop Configuration parameter. I assume it's there to pass things like the location of SSL keystores/truststores, etc. I don't know whether it also honors the AUTHZ/AUTHN properties.

On the flip side, I don't see any way to specify a different Configuration to UserGroupInformation at the instance level. I can set the global Configuration to turn on Kerberos authentication, but I can't pass one into a method to get a Kerberos-authenticated UGI on an otherwise non-authenticated system. Or vice versa. (It also seems that there's an internal latch, so once you've turned on Kerberos authentication you can't turn it off: that flag will not be updated. I'm not 100% certain though, since it might be an artifact of having a 'current user' that uses Kerberos authentication.)

This brings us to two questions to complete due diligence:

1. Is it possible to get a SIMPLE UGI instance when UserGroupInformation is configured to support Kerberos authentication? Or vice versa?

2. Is Kerberos authentication ignored if I call an HDFS cluster configured for SIMPLE authentication with authorization set to false? Or does that fail? (I think it's the latter, but I can't rule out that we're doing something odd and preventing it from working.)

Thanks.

--
Bear Giles
Sr. Java Application Engineer
SnapLogic Inc
