[ 
https://issues.apache.org/jira/browse/HADOOP-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16785731#comment-16785731
 ] 

Wei-Chiu Chuang commented on HADOOP-16119:
------------------------------------------

Hi [~hexiaoqiao], I really appreciate your insights!

Regarding delegation tokens: they are stored in ZooKeeper, and after 
HADOOP-14445 they are shared among KMS instances.

Key store consistency: I am not sure how others use KMS, but within CDH we 
have a plugin that directs requests to a backend server, "Cloudera 
KeyTrustee Server". Essentially, KMS serves as a proxy for the backend, so 
consistency is guaranteed by the backend.

Cloudera KeyTrustee Server is currently a proprietary component. But it sounds 
like Cloudera will eventually become "100% open source", so that's an option 
for you. I think your proposal makes sense. I am just not sure how much work 
it will require. At Cloudera there is a team dedicated to Cloudera KeyTrustee 
Server development, so I imagine it's a non-trivial amount of work.

I am looking forward to a good persistent and consistent key store too, if we 
can come up with a good design. In fact, I am concerned about Cloudera 
KeyTrustee Server (CKTS) performance under the load you described.

[~anu] [~xyao] Does the Sentry KMS provide a persistent and consistent key 
store, by any chance?

You are correct that adding or removing a KMS instance requires a client-side 
change. Currently that means a cluster-wide rolling restart. I imagine we 
could use NameNode's FsServerDefaults to update the KMS list dynamically.
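
To illustrate why the instance list is a client-side concern: clients are 
typically configured with a single key provider URI that enumerates every KMS 
host. The sketch below expands such a URI into per-instance URLs. The host 
names are placeholders and the function name is hypothetical; this is not the 
Hadoop client's parsing code, only an illustration of the documented 
`kms://<scheme>@<host1>;<host2>:<port>/<path>` form.

```python
def kms_instance_urls(provider_uri):
    """Expand a multi-host KMS provider URI into one URL per instance.

    Hypothetical helper for illustration only.
    """
    prefix = "kms://"
    if not provider_uri.startswith(prefix):
        raise ValueError("not a kms:// provider URI: " + provider_uri)
    rest = provider_uri[len(prefix):]

    # "https@host1;host2:9600/kms" -> scheme "https", remainder after "@"
    scheme, at, remainder = rest.partition("@")
    if not at:
        raise ValueError("missing '<scheme>@' part: " + provider_uri)

    # Split the authority ("host1;host2:9600") from the path ("kms").
    authority, _, path = remainder.partition("/")

    # The port, if present, follows the last host and applies to all hosts.
    hosts_part, colon, port = authority.rpartition(":")
    if not colon:
        hosts_part, port = authority, ""

    urls = []
    for host in hosts_part.split(";"):
        netloc = host + ":" + port if port else host
        urls.append(scheme + "://" + netloc + "/" + path)
    return urls


# Placeholder hosts; adding a third instance means editing this string
# in every client's configuration.
uri = "kms://https@kms01.example.com;kms02.example.com:9600/kms"
urls = kms_instance_urls(uri)
for url in urls:
    print(url)
```

Because the host list lives in the URI itself, every client holding the old 
string keeps talking to the old set of instances until its configuration is 
refreshed.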

 

I am not clear about the HA argument. In the current design, a KMS connection 
is not "sticky": the client keeps no memory of which instances are down, so 
_each KMS request_ has an equal probability of attempting to reach a dead 
KMS. Is that what you meant?
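
A toy model of the non-sticky behavior described above, assuming each request 
picks an instance uniformly at random and independently of earlier requests. 
This is a sketch of the probability argument only, not the Hadoop client's 
actual selection code; the instance names and the function are hypothetical.

```python
import random


def first_attempt_dead_fraction(instances, dead, requests, seed=0):
    """Fraction of requests whose first attempt lands on a dead instance
    when every request picks an instance independently (non-sticky)."""
    rng = random.Random(seed)
    dead = set(dead)
    hits = 0
    for _ in range(requests):
        # No memory of earlier failures: each request draws afresh.
        if rng.choice(instances) in dead:
            hits += 1
    return hits / requests


instances = ["kms1", "kms2", "kms3", "kms4"]  # hypothetical hosts
fraction = first_attempt_dead_fraction(
    instances, dead={"kms4"}, requests=100_000
)
# With 1 of 4 instances dead, the fraction should be close to 1/4.
print(f"{fraction:.3f}")
```

Under this model the fraction of first attempts that hit a dead instance 
stays near dead/total for as long as the instance is down, which is the cost 
a sticky connection would avoid.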

> KMS on Hadoop RPC Engine
> ------------------------
>
>                 Key: HADOOP-16119
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16119
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: Jonathan Eagles
>            Assignee: Wei-Chiu Chuang
>            Priority: Major
>         Attachments: Design doc_ KMS v2.pdf
>
>
> Per discussion on common-dev and text copied here for ease of reference.
> https://lists.apache.org/thread.html/0e2eeaf07b013f17fad6d362393f53d52041828feec53dcddff04808@%3Ccommon-dev.hadoop.apache.org%3E
> {noformat}
> Thanks all for the inputs,
> To offer additional information (while Daryn is working on his stuff),
> optimizing RPC encryption opens up another possibility: migrating KMS
> service to use Hadoop RPC.
> Today's KMS uses HTTPS + REST API, much like webhdfs. It has very
> undesirable performance (a few thousand ops per second) compared to
> NameNode. Unfortunately for each NameNode namespace operation you also need
> to access KMS too.
> Migrating KMS to Hadoop RPC greatly improves its performance (if
> implemented correctly), and RPC encryption would be a prerequisite. So
> please keep that in mind when discussing the Hadoop RPC encryption
> improvements. Cloudera is very interested to help with the Hadoop RPC
> encryption project because a lot of our customers are using at-rest
> encryption, and some of them are starting to hit KMS performance limit.
> This whole "migrating KMS to Hadoop RPC" was Daryn's idea. I heard this
> idea in the meetup and I am very thrilled to see this happening because it
> is a real issue bothering some of our customers, and I suspect it is the
> right solution to address this tech debt.
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
