[ 
https://issues.apache.org/jira/browse/HADOOP-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899833#comment-15899833
 ] 

Yongjun Zhang commented on HADOOP-14104:
----------------------------------------

Hi [~rushabh.shah], 

Thanks again for your updated patch. Below are my comments:

1. Currently we do 
{code}
      final CryptoCodec codec = getCryptoCodec(conf, feInfo);
      KeyVersion decrypted = decryptEncryptedDataEncryptionKey(feInfo);
{code}
in {{createWrappedOutputStream()}}, where {{conf}} is the configuration of the 
local cluster. The local configuration may differ from the remote cluster's, 
so the decryption can fail here.

2. Suggest introducing
{code}
public static final String HADOOP_SECURITY_KEY_PROVIDER_PATH_DEFAULT = "";
{code}
in parallel to HADOOP_SECURITY_KEY_PROVIDER_PATH in 
CommonConfigurationKeysPublic.java (@awang would you please confirm whether 
it's ok to do so, since this class is public), and use this constant at the 
multiple places that currently use "".
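To illustrate point 2, the new constant could sit next to the existing key as 
sketched below (the enclosing class name here is a stand-in, not the real 
CommonConfigurationKeysPublic):
{code}
// Sketch only: the new default mirrors the existing key, so callers can
// write conf.getTrimmed(HADOOP_SECURITY_KEY_PROVIDER_PATH,
// HADOOP_SECURITY_KEY_PROVIDER_PATH_DEFAULT) instead of hard-coding "".
public class KeyProviderKeysSketch {
  public static final String HADOOP_SECURITY_KEY_PROVIDER_PATH =
      "hadoop.security.key.provider.path";
  public static final String HADOOP_SECURITY_KEY_PROVIDER_PATH_DEFAULT = "";
}
{code}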

3. Note that "dfs.encryption.key.provider.uri" is deprecated and replaced 
by hadoop.security.key.provider.path (see HDFS-10489), so suggest renaming 
the variable keyProviderUri to keyProviderPath.

4. Suggest adding two package-scope methods in DFSClient
{code}
  void addKmsKeyProviderPath(...)
  String getKmsKeyProviderPath(...)
{code}
and call them from needed places.

5. The URI used in DistributedFileSystem and DFSClient may be different; see 
DistributedFileSystem#initialize below:
{code}
  public void initialize(URI uri, Configuration conf) throws IOException {
    ...
    this.dfs = new DFSClient(uri, conf, statistics);
    this.uri = URI.create(uri.getScheme()+"://"+uri.getAuthority());
    this.workingDir = getHomeDirectory();
    ...
{code}
So updating and retrieving the fs/keyProvider from the credentials may 
mismatch, because uri.toString() is used as the key in the credentials map. 
We may just use {{uri.getScheme()+"://"+uri.getAuthority()}} as the key and 
encapsulate this in the add/get methods I proposed in 4. 
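Points 4 and 5 together could be sketched roughly as below; the class name and 
the plain Map are stand-ins for wherever DFSClient actually keeps the 
credentials, and only the key construction matters:
{code}
// Sketch (points 4 and 5): package-scope add/get helpers that always
// derive the credentials key from scheme://authority, so DFSClient and
// DistributedFileSystem agree even when their URIs differ in path.
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

class KmsProviderPathSketch {
  private final Map<String, String> credentials = new HashMap<>();

  // Canonical key: scheme://authority, dropping any path component.
  static String canonicalKey(URI uri) {
    return uri.getScheme() + "://" + uri.getAuthority();
  }

  void addKmsKeyProviderPath(URI nnUri, String providerPath) {
    credentials.put(canonicalKey(nnUri), providerPath);
  }

  String getKmsKeyProviderPath(URI nnUri) {
    return credentials.get(canonicalKey(nnUri));
  }
}
{code}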

6. It seems we need a similar change in WebHdfsFileSystem when calling 
addDelegationTokens.

7. About your question w.r.t. {{public boolean isHDFSEncryptionEnabled()}} 
throwing StandbyException: one solution is to incorporate the remote 
cluster's nameservices configuration into the client (distcp, for example) 
configuration, and let the client handle the NN failover and retry. We 
need to document this.
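For point 7, the client-side configuration could look roughly like this; the 
nameservice id "remotens", the NN ids, and the hosts are placeholders, while 
the keys themselves are the standard HDFS HA ones:
{code}
<property>
  <name>dfs.nameservices</name>
  <value>localns,remotens</value>
</property>
<property>
  <name>dfs.ha.namenodes.remotens</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.remotens.nn1</name>
  <value>nn1.remote.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.remotens.nn2</name>
  <value>nn2.remote.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.remotens</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
{code}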

Hi [~daryn] and [~andrew.wang], would you please help with a review too, and 
see whether my comments above make sense to you?

Thanks.


> Client should always ask namenode for kms provider path.
> --------------------------------------------------------
>
>                 Key: HADOOP-14104
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14104
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: kms
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>         Attachments: HADOOP-14104-trunk.patch, HADOOP-14104-trunk-v1.patch, 
> HADOOP-14104-trunk-v2.patch
>
>
> According to current implementation of kms provider in client conf, there can 
> only be one kms.
> In multi-cluster environment, if a client is reading encrypted data from 
> multiple clusters it will only get kms token for local cluster.
> Not sure whether the target version is correct or not.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
