[
https://issues.apache.org/jira/browse/HADOOP-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384730#comment-15384730
]
Xiao Chen commented on HADOOP-13381:
------------------------------------
I had an offline discussion with [~asuresh]; here are the minutes:
- Arun brought up the point that there's {{authRetry}} in KMSCP: when
{{authToken}} has expired, a new {{DelegationTokenAuthenticatedURL.Token}} is
created and the call is retried.
This doesn't help in our case, since inside the call
([code|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/web/DelegationTokenAuthenticatedURL.java#L290-L296])
the UGI's credentials are used to get the kms-dt, which would be the same
expired token.
- Regarding YARN log aggregation, I explained that MR jobs get tokens and
run, and at the end the NM uses that job's tokens to do YARN log aggregation as
a final MR job. So this part should be done as the MR user (as opposed to the NM
user: yarn), since it writes to the MR user's dir {{/tmp/logs/user/....}}. cc
[~rkanter] in case anything I said is not accurate.
- To minimize impact, we should only update {{kms-dt}} in the call.
- Arun has a general concern about updating the actualUgi's token, since the
normal use case is doAs / proxy user. This could be enhanced in another jira.
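To illustrate the first point above, here is a toy model of why retrying alone does not help. The classes ({{ToyToken}}, {{ToyCredentials}}, {{RetrySketch}}) are hypothetical stand-ins written for this sketch, not Hadoop APIs; the only assumption carried over from the discussion is that every attempt looks up the kms-dt from the same credentials.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: hypothetical stand-ins, not Hadoop classes.
final class RetrySketch {

    /** A delegation token reduced to an id and an expiry timestamp. */
    static final class ToyToken {
        final String id;
        final long expiresAtMillis;

        ToyToken(String id, long expiresAtMillis) {
            this.id = id;
            this.expiresAtMillis = expiresAtMillis;
        }

        boolean isExpired(long nowMillis) {
            return nowMillis >= expiresAtMillis;
        }
    }

    /** Stand-in for the credentials held by a UGI: service name -> token. */
    static final class ToyCredentials {
        private final Map<String, ToyToken> tokens = new HashMap<>();

        void put(String service, ToyToken token) {
            tokens.put(service, token);
        }

        ToyToken get(String service) {
            return tokens.get(service);
        }
    }

    /** One attempt: the token is always looked up from the credentials. */
    static boolean callOnce(ToyCredentials creds, long nowMillis) {
        ToyToken dt = creds.get("kms-dt");
        return dt != null && !dt.isExpired(nowMillis);
    }

    /** Retrying re-reads the same credentials, so an expired token stays expired. */
    static boolean callWithRetry(ToyCredentials creds, long nowMillis, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (callOnce(creds, nowMillis)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ToyCredentials creds = new ToyCredentials();
        creds.put("kms-dt", new ToyToken("t1", 100L));
        System.out.println("expired token, with retries: " + callWithRetry(creds, 200L, 3));
        // Only replacing the token in the credentials changes the outcome.
        creds.put("kms-dt", new ToyToken("t2", 1000L));
        System.out.println("after token update: " + callWithRetry(creds, 200L, 3));
    }
}
```

In this model the number of retries is irrelevant; the call only succeeds once the kms-dt stored in the credentials is replaced.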
(My thought after the discussion): to counter the race between multiple threads
calling the same cached KMSCP, we should create a new UGI object and update the
tokens on it.
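A minimal sketch of that idea, assuming a generic copy-on-write approach: rather than mutating a shared object that other threads may be reading, build a new object with the updated kms-dt and swap the reference atomically. {{ToyUgi}} and {{ToyProvider}} are hypothetical names for this illustration and do not reflect the actual patch or Hadoop's UGI API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Toy model only: hypothetical stand-ins, not Hadoop classes.
final class UgiSwapSketch {

    /** Immutable snapshot standing in for a UGI and its tokens (service -> token id). */
    static final class ToyUgi {
        private final Map<String, String> tokens;

        ToyUgi(Map<String, String> tokens) {
            this.tokens = Collections.unmodifiableMap(new HashMap<>(tokens));
        }

        String token(String service) {
            return tokens.get(service);
        }

        /** Returns a new ToyUgi with one token replaced; this object is unchanged. */
        ToyUgi withToken(String service, String tokenId) {
            Map<String, String> copy = new HashMap<>(tokens);
            copy.put(service, tokenId);
            return new ToyUgi(copy);
        }
    }

    /** Stand-in for a cached provider shared by many threads. */
    static final class ToyProvider {
        private final AtomicReference<ToyUgi> ugi;

        ToyProvider(ToyUgi initial) {
            this.ugi = new AtomicReference<>(initial);
        }

        ToyUgi current() {
            return ugi.get();
        }

        /** Replaces only the kms-dt, publishing a new snapshot atomically. */
        void updateKmsToken(String newTokenId) {
            ugi.updateAndGet(old -> old.withToken("kms-dt", newTokenId));
        }
    }

    public static void main(String[] args) {
        Map<String, String> initial = new HashMap<>();
        initial.put("kms-dt", "expired-token");
        ToyProvider provider = new ToyProvider(new ToyUgi(initial));

        ToyUgi seenByOtherThread = provider.current();
        provider.updateKmsToken("fresh-token");

        // A reader holding the old snapshot still sees a consistent (old) view.
        System.out.println("old snapshot: " + seenByOtherThread.token("kms-dt"));
        System.out.println("new snapshot: " + provider.current().token("kms-dt"));
    }
}
```

Threads that already hold the old snapshot keep a consistent view, and new calls pick up the updated token; only the kms-dt entry is touched, in line with the goal of minimizing impact.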
Will post an updated patch with more details.
> KMS clients running in the same JVM should use updated KMS Delegation Token
> ---------------------------------------------------------------------------
>
> Key: HADOOP-13381
> URL: https://issues.apache.org/jira/browse/HADOOP-13381
> Project: Hadoop Common
> Issue Type: Bug
> Components: kms
> Affects Versions: 2.6.0
> Reporter: Xiao Chen
> Assignee: Xiao Chen
> Priority: Critical
> Attachments: HADOOP-13381.01.patch
>
>
> When {{/tmp}} is set up as an EZ, one may experience YARN log aggregation
> failure after the very first KMS token has expired. The MR job itself runs
> fine though.
> When this happens, the YARN NodeManager's log will show
> {{AuthenticationException}} with {{token is expired}} / {{token can't be
> found in cache}}, depending on whether the expired token has already been
> removed in the background.