[
https://issues.apache.org/jira/browse/HADOOP-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rushabh S Shah updated HADOOP-14104:
------------------------------------
Attachment: HADOOP-14104-trunk-v2.patch
This patch addresses most of the previous comments.
What changed in this patch compared to the previous one?
1. During job submission, added a new secret in the credential's secret map.
DFS-KMS-<namenodeUri> --> keyproviderUri
This mapping captures the namenode's keyProviderUri at the time of job
submission.
Every task that is scheduled to run will contact this key provider uri to
decrypt EDEKs.
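As a rough sketch of this mapping (using a plain map to stand in for Hadoop's Credentials secret map; the helper names and example uris below are illustrative, not the actual patch code):
{code:title=KeyProviderSecretSketch.java|borderStyle=solid}
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class KeyProviderSecretSketch {
  static final String DFS_KMS_PREFIX = "DFS-KMS-";

  // At job submission: record the namenode's key provider uri in the
  // credentials' secret map, keyed by the namenode uri.
  static void addKeyProviderSecret(Map<String, byte[]> secrets,
      String namenodeUri, String keyProviderUri) {
    secrets.put(DFS_KMS_PREFIX + namenodeUri,
        keyProviderUri.getBytes(StandardCharsets.UTF_8));
  }

  // In the task: look up the key provider uri glued to this namenode.
  static String getKeyProviderSecret(Map<String, byte[]> secrets,
      String namenodeUri) {
    byte[] bytes = secrets.get(DFS_KMS_PREFIX + namenodeUri);
    return bytes == null ? null : new String(bytes, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    Map<String, byte[]> secrets = new HashMap<>();
    addKeyProviderSecret(secrets, "hdfs://nn1:8020",
        "kms://http@kms-host:9600/kms");
    System.out.println(getKeyProviderSecret(secrets, "hdfs://nn1:8020"));
  }
}
{code}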
2. The key provider uri will be searched for in the following order:
- The credentials secret map.
- The namenode, via getServerDefaults.
- The local conf.
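The fallback chain above could be sketched like this (plain Java, with suppliers standing in for the credentials-map, getServerDefaults, and local-conf lookups; all names here are illustrative, not the actual patch code):
{code:title=KeyProviderLookupSketch.java|borderStyle=solid}
import java.util.function.Supplier;

public class KeyProviderLookupSketch {
  // Each source returns null when it has no key provider uri to offer.
  static String resolveKeyProviderUri(Supplier<String> fromCredentials,
      Supplier<String> fromServerDefaults, Supplier<String> fromLocalConf) {
    // 1. Credentials secret map (populated at job submission).
    String uri = fromCredentials.get();
    if (uri != null) {
      return uri;
    }
    // 2. Ask the namenode via getServerDefaults.
    uri = fromServerDefaults.get();
    if (uri != null) {
      return uri;
    }
    // 3. Fall back to the local configuration.
    return fromLocalConf.get();
  }
}
{code}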
Previous concerns:
1. From [~andrew.wang]
bq. I like that getServerDefaults is lock-free, but I'm still worried about the
overhead.
The namenode#getServerDefaults will be queried only once at the time of job
submission.
2. From [~yzhangal]
{quote}
Currently getServerDefaults() contact NN every hour, to find if there is any
update of keyprovider. If keyprovider changed within the hour,
client code may get into exception, wonder if we have mechanism to handle the
exception and update the keyprovider and try again?
{quote}
This is a very good question which I didn't think about while writing the
previous patch. Thanks!
We glue the namenode uri to the key provider uri at the time of job submission
and persist it in the UGI's credentials object.
The task will find it in the credentials object and no longer needs to contact
the namenode.
If there is an update (hardware replacement or maintenance) to the key
provider, we plan to keep the old one in decommission mode for 7 days.
That way, all the tokens that were given out while it was still active will
remain valid for 7 days, and the new key provider will issue tokens for newly
submitted jobs.
I tried to incorporate all the previous comments in the current patch (v2),
but let me know if I missed any.
I need one suggestion.
{code:title=DFSClient.java|borderStyle=solid}
public boolean isHDFSEncryptionEnabled() {
  try {
    return DFSUtilClient.isHDFSEncryptionEnabled(getKeyProviderUri());
  } catch (IOException ioe) {
    // This means ClientProtocol#getServerDefaults threw a StandbyException
    return false;
  }
}
{code}
{{getKeyProviderUri}} calls NamenodeRpcServer#getServerDefaults, which can
throw a StandbyException, in which case I am returning false.
I don't know what the right thing to do is.
{{DFSClient.isHDFSEncryptionEnabled()}} is called by
{{DistributedFileSystem.getTrashRoot(Path path)}}, which doesn't declare any
IOException, so I need to decide what to do when an exception is encountered.
Your help is much appreciated.
Please review.
> Client should always ask namenode for kms provider path.
> --------------------------------------------------------
>
> Key: HADOOP-14104
> URL: https://issues.apache.org/jira/browse/HADOOP-14104
> Project: Hadoop Common
> Issue Type: Improvement
> Components: kms
> Reporter: Rushabh S Shah
> Assignee: Rushabh S Shah
> Attachments: HADOOP-14104-trunk.patch, HADOOP-14104-trunk-v1.patch,
> HADOOP-14104-trunk-v2.patch
>
>
> According to the current implementation, the kms provider is read from the
> client conf, so there can only be one kms.
> In a multi-cluster environment, if a client is reading encrypted data from
> multiple clusters, it will only get a kms token for the local cluster.
> Not sure whether the target version is correct or not.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]