[
https://issues.apache.org/jira/browse/HADOOP-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903191#comment-15903191
]
Rushabh S Shah commented on HADOOP-14104:
-----------------------------------------
{quote}
final CryptoCodec codec = getCryptoCodec(conf, feInfo);
in createWrappedOutputStream(), where conf is the configuration of the local
cluster. There is a possibility that the local configuration is different from
the remote cluster's. So it's possible to fail here.
{quote}
We are not reading the key provider path from conf within the
{{getCryptoCodec}} method, so I don't think my change (from the v2 patch) will
break anything.
bq. public static final String HADOOP_SECURITY_KEY_PROVIDER_PATH_DEFAULT = "";
Before v2 of the patch, there was no default value for
{{HADOOP_SECURITY_KEY_PROVIDER_PATH_DEFAULT}}, and I kept it that way.
I think this change is out of the scope of this jira. I suggest you create a
new jira for it; let's not mix multiple things in one jira.
bq. 3. Notice that "dfs.encryption.key.provider.uri" is deprecated and replaced
with hadoop.security.key.provider.path (see HDFS-10489). So suggest to replace
variable name keyProviderUri with keyProviderPath
I didn't even know this config used to be called
{{dfs.encryption.key.provider.uri}}.
But this {{hadoop.security.key.provider.path}} on the client side is just a
URI and not a path in any way. We convert it to a path via
{{KMSClientProvider(URI uri, Configuration conf)}}, where we extract the path
via {{KMSClientProvider#extractKMSPath(URI uri)}}. That's why I named it
{{keyProviderUri}}.
But if you feel strongly about the variable name, I can change it to
provider path in my next revision.
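To illustrate the URI-vs-path distinction above: a KMS provider URI such as {{kms://http@host:9600/kms}} is a full URI whose path is only one component. The following is a minimal, illustrative sketch using plain {{java.net.URI}}; it is *not* Hadoop's actual extraction logic (which lives in {{KMSClientProvider}}/{{ProviderUtils}}), and the class name and example host are made up.

```java
import java.net.URI;

public class KmsUriDemo {
    // Illustrative only: shows that a provider "path" is just one piece of a
    // URI of the form kms://<scheme>@<host>:<port>/<path>. Hadoop's real
    // parsing is done by KMSClientProvider#extractKMSPath / ProviderUtils.
    static String extractPath(String providerUri) {
        URI uri = URI.create(providerUri);
        return uri.getPath(); // e.g. "/kms" for kms://http@localhost:9600/kms
    }

    public static void main(String[] args) {
        System.out.println(extractPath("kms://http@localhost:9600/kms"));
    }
}
```

This is why the config value is naturally a {{keyProviderUri}} on the client side: the path only materializes after parsing.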
{quote}
4. Suggest to add two methods of package scope in DFSClient
void addKmsKeyProviderPath(...)
String getKmsKeyProviderPath(...)
{quote}
I think we would use an add method only if the data structure were going to
contain more than one entry, and I don't think the provider uri on the client
side will ever contain more than one.
I already have {{getKeyProviderUri}} in v2 of the patch.
bq. 5.The uri used in DistributedFileSystem and DFSClient may be different, see
DistributedFileSystem#initialize below
That is a very good observation. I think it's safe to flip these two
statements in the DistributedFileSystem class.
Will do it in the next revision.
{noformat}
this.dfs = new DFSClient(uri, conf, statistics);
this.uri = URI.create(uri.getScheme()+"://"+uri.getAuthority());
{noformat}
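For context on why the two URIs can differ: the second statement rebuilds {{this.uri}} from only the scheme and authority, so any path component of the original URI is dropped, while the {{DFSClient}} is constructed from the raw URI. A minimal {{java.net.URI}} illustration (the class name and hostname are made up; this is not Hadoop code):

```java
import java.net.URI;

public class UriNormalizeDemo {
    // Mimics what DistributedFileSystem#initialize does with
    // URI.create(uri.getScheme() + "://" + uri.getAuthority()):
    // only scheme + authority survive; path, query, and fragment are dropped.
    static URI normalize(URI uri) {
        return URI.create(uri.getScheme() + "://" + uri.getAuthority());
    }

    public static void main(String[] args) {
        URI raw = URI.create("hdfs://nn1.example.com:8020/user/alice");
        System.out.println(normalize(raw)); // the /user/alice path is gone
    }
}
```

Flipping the statement order means the client is built from the same normalized URI the filesystem keeps.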
bq. 6. Seems we need a similar change in WebHdfsFileSystem when calling
addDelegationTokens
I don't think the key provider is used by WebHdfsFileSystem. Maybe I'm missing
something.
Can you please elaborate on your comment?
{quote}
7. About your question w.r.t. public boolean isHDFSEncryptionEnabled() throwing
StandbyException. There is a solution, that is, we need to incorporate remote's
cluster's nameservices configurations in the client (distcp for example)
configuration, and let the client handle the NN failover and retry. We need to
document this.
{quote}
I didn't understand this comment either. Please elaborate.
bq. We need a solution soon, so I hope you don't mind I just uploaded a patch
to address most of my comments.
I don't see any sense of urgency from the community, since providing the key
provider path via conf has been the standard way since the feature was
introduced in Hadoop 2.6 (Aug 2014).
That said, I was waiting for comments from [~daryn] and [~andrew.wang]
before putting up a new patch.
I know [~daryn] was out of the office for the last couple of days and
[~andrew.wang] was involved in the 2.8.0 release work, so I thought I would
wait a few days before pinging them.
[~yzhangal]: Since I am the assignee of this jira and I am updating patches
regularly, _please allow_ me to upload the patches henceforth. I would like to
get valuable review comments from you.
I hope you don't mind.
> Client should always ask namenode for kms provider path.
> --------------------------------------------------------
>
> Key: HADOOP-14104
> URL: https://issues.apache.org/jira/browse/HADOOP-14104
> Project: Hadoop Common
> Issue Type: Improvement
> Components: kms
> Reporter: Rushabh S Shah
> Assignee: Rushabh S Shah
> Attachments: HADOOP-14104-trunk.patch, HADOOP-14104-trunk-v1.patch,
> HADOOP-14104-trunk-v2.patch, HADOOP-14104-trunk-v3.patch
>
>
> According to the current implementation of the kms provider in the client conf,
> there can only be one kms.
> In a multi-cluster environment, if a client is reading encrypted data from
> multiple clusters, it will only get the kms token for the local cluster.
> Not sure whether the target version is correct or not.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]