[ 
https://issues.apache.org/jira/browse/HADOOP-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903191#comment-15903191
 ] 

Rushabh S Shah commented on HADOOP-14104:
-----------------------------------------

{quote}
      final CryptoCodec codec = getCryptoCodec(conf, feInfo);
 in {{createWrappedOutputStream()}}, where conf is the configuration of the local 
cluster. There is a possibility that the local configuration is different from 
the remote cluster's, so it's possible to fail here.
{quote}
We are not reading the key provider path from conf within the {{getCryptoCodec}} 
method, so I don't think my change (from the v2 patch) will break anything.

bq. public static final String HADOOP_SECURITY_KEY_PROVIDER_PATH_DEFAULT = "";
Before v2 of the patch, there was no default value for 
{{HADOOP_SECURITY_KEY_PROVIDER_PATH_DEFAULT}}, and I kept it that way.
I think this change is out of scope for this jira; I suggest creating a 
new jira for it. Let's not mix multiple things in one jira.

bq. 3. Notice that "dfs.encryption.key.provider.uri" is deprecated and replaced 
with hadoop.security.key.provider.path (see HDFS-10489). So suggest to replace 
variable name keyProviderUri with keyProviderPath
I didn't even know this config used to be called 
{{dfs.encryption.key.provider.uri}}. 
But {{hadoop.security.key.provider.path}} on the client side is just a 
URI, not a path in any way. We convert it to a path via 
{{KMSClientProvider(URI uri, Configuration conf)}}, where we extract the path 
via {{KMSClientProvider#extractKMSPath(URI uri)}}. That's why I named it 
{{keyProviderUri}}.
But if you feel strongly about the variable name, I can change it to 
{{keyProviderPath}} in my next revision.
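
As context for the naming point above, here is a minimal standalone sketch (plain Java, *not* the actual Hadoop implementation; the class and method names are hypothetical) of the idea behind {{KMSClientProvider#extractKMSPath}}: a provider URI such as {{kms://http@kms-host:9600/kms}} carries the transport as user-info and the KMS service path as its path component, which is why the value starts life as a URI and only later becomes a path.

```java
import java.net.URI;

// Hypothetical sketch of the URI-to-path idea, not Hadoop's actual code.
// In kms://http@kms-host:9600/kms the user-info ("http") names the
// transport, and the path component ("/kms") names the KMS service path.
public class KmsUriSketch {

    static String extractPath(URI providerUri) {
        // java.net.URI already parses user-info@host:port, so the
        // service path is simply the URI's path component.
        return providerUri.getPath();
    }

    public static void main(String[] args) {
        URI uri = URI.create("kms://http@kms-host:9600/kms");
        System.out.println(extractPath(uri)); // prints "/kms"
    }
}
```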

{quote}
4. Suggest to add two methods of package scope in DFSClient:
  void addKmsKeyProviderPath(...)
  String getKmsKeyProviderPath(...)
{quote}
I think we use an add method only when the data structure is going to contain 
more than one entry, and I don't think the provider uri on the client side 
will ever contain more than one.
I already have {{getKeyProviderUri}} in v2 of the patch.
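
To illustrate the single-entry point above, a minimal hypothetical sketch (class and method names are mine, not from the patch): when the client caches at most one provider URI, a plain setter/getter pair is sufficient, and an add-style method would only make sense for a multi-entry structure.

```java
import java.net.URI;

// Hypothetical sketch: the client caches a single key provider URI,
// so a plain setter/getter pair suffices. An add-style method would
// imply a multi-entry data structure, which this is not.
public class ClientKeyProviderCache {
    private URI keyProviderUri; // single entry, not a collection

    void setKeyProviderUri(URI uri) {
        this.keyProviderUri = uri;
    }

    URI getKeyProviderUri() {
        return keyProviderUri;
    }
}
```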

bq. 5.The uri used in DistributedFileSystem and DFSClient may be different, see 
DistributedFileSystem#initialize below
This is a very good observation. I think it's safe to flip these two statements 
in the DistributedFileSystem class.
I will do it in the next revision.
{noformat}
  this.dfs = new DFSClient(uri, conf, statistics);
  this.uri = URI.create(uri.getScheme()+"://"+uri.getAuthority());
{noformat}
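
For illustration, a small standalone sketch (host names made up) of why the two URIs can differ: {{DistributedFileSystem}} keeps only scheme + authority for {{this.uri}}, while the URI handed to {{DFSClient}} may still carry a path component.

```java
import java.net.URI;

public class UriNormalizationSketch {
    public static void main(String[] args) {
        // The fs URI passed to initialize() may carry a path component;
        // the normalized form keeps only scheme + authority.
        URI passedIn = URI.create("hdfs://nn-host:8020/some/path");
        URI normalized = URI.create(
            passedIn.getScheme() + "://" + passedIn.getAuthority());
        System.out.println(passedIn);    // hdfs://nn-host:8020/some/path
        System.out.println(normalized);  // hdfs://nn-host:8020
    }
}
```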

bq. 6. Seems we need a similar change in WebHdfsFileSystem when calling 
addDelegationTokens
I don't think the key provider is used by {{WebHdfsFileSystem}}; maybe I'm 
missing something.
Can you please elaborate on your comment?

{quote}
7. About your question w.r.t. public boolean isHDFSEncryptionEnabled() throwing 
StandbyException. There is a solution, that is, we need to incorporate remote's 
cluster's nameservices configurations in the client (distcp for example) 
configuration, and let the client handle the NN failover and retry. We need to 
document this.
{quote}
I didn't understand this comment either. Please elaborate.

bq. We need a solution soon, so I hope you don't mind I just uploaded a patch 
to address most of my comments.
I don't see any sense of urgency from the community, since providing the key 
provider path via conf has been the standard way ever since the feature was 
introduced in hadoop 2.6 (Aug 2014).
Having said that, I was waiting for comments from [~daryn] and [~andrew.wang] 
before putting up a new patch.
I know [~daryn] was out of the office for the last couple of days and 
[~andrew.wang] was involved in the 2.8.0 release work, so I thought I would 
wait a few days before pinging them.
[~yzhangal]: Since I am the assignee of this jira and am updating patches 
regularly, _please allow_ me to upload the patches henceforth. I would still 
like to get your valuable review comments.
I hope you don't mind that.

> Client should always ask namenode for kms provider path.
> --------------------------------------------------------
>
>                 Key: HADOOP-14104
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14104
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: kms
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>         Attachments: HADOOP-14104-trunk.patch, HADOOP-14104-trunk-v1.patch, 
> HADOOP-14104-trunk-v2.patch, HADOOP-14104-trunk-v3.patch
>
>
> According to the current implementation of the kms provider in the client 
> conf, there can only be one kms.
> In a multi-cluster environment, if a client is reading encrypted data from 
> multiple clusters, it will only get a kms token for the local cluster.
> Not sure whether the target version is correct or not.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
