[
https://issues.apache.org/jira/browse/HADOOP-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15625994#comment-15625994
]
Xiao Chen commented on HADOOP-13590:
------------------------------------
Thanks [[email protected]] for the prompt review!
Good point on {{getMaxTgtRenewalRetryCount}}. On second thought, I think it
can be eliminated: the retry policy's maxRetries becomes {{Integer.MAX_VALUE}}
and we simply check against the TGT end time. Currently the method only makes
sure we can create the RetryPolicy with the correct maxRetries. I will do that
in the next patch and add comments.
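To illustrate the idea, here is a minimal sketch of a retry loop that has no retry counter and stops only when a deadline passes. It is not the patch's code: the class {{RetryUntilDeadline}}, the {{Clock}} interface and all parameter names are hypothetical, and a plain capped exponential backoff stands in for the Hadoop RetryPolicy.

```java
import java.util.concurrent.Callable;

/** Hypothetical sketch: retry an action until a deadline, with capped exponential backoff. */
final class RetryUntilDeadline {

  /** Injectable time source and sleeper, so the loop can be tested without real waiting. */
  interface Clock {
    long now();
    void sleep(long millis) throws InterruptedException;
  }

  static <T> T retryUntil(Callable<T> action, long endTimeMs,
      long baseSleepMs, long maxSleepMs, Clock clock) throws Exception {
    long sleepMs = baseSleepMs;
    // No retry counter: the deadline is the only stop condition.
    while (true) {
      try {
        return action.call();
      } catch (Exception e) {
        long remaining = endTimeMs - clock.now();
        if (remaining <= 0) {
          // Deadline reached: surface the most recent failure.
          throw e;
        }
        // Never sleep past the deadline, so one last attempt happens at the end time.
        clock.sleep(Math.min(sleepMs, remaining));
        sleepMs = Math.min(sleepMs * 2, maxSleepMs);
      }
    }
  }
}
```

In this sketch the end time would be the TGT expiry, and the action would be the renewal attempt. A production loop would probably also want to treat InterruptedException from the action separately rather than retrying it.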
bq. Test-wise, I've added support for more backoff in tests that wait; look in
LambdaTestUtils.
Thanks for the good work; let me try replacing the GenericTestUtils usage with it.
bq. I also see that the code to set up a
javax.security.auth.login.Configuration is surfacing again...
See my [comment
above|https://issues.apache.org/jira/browse/HADOOP-13590?focusedCommentId=15517201&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15517201],
it's due to the class name {{Configuration}} conflicting between Hadoop and
javax. I guess we'll have to fully qualify one or the other. :(
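As a rough illustration of the name clash, the sketch below keeps the simple name {{Configuration}} for a Hadoop-like class and fully qualifies the javax one. Everything here is hypothetical: the nested {{Configuration}} class is only a stand-in for Hadoop's, the property keys are made up, and the login module name assumes an Oracle/OpenJDK runtime.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;

final class JaasConfigSketch {

  /** Hypothetical stand-in for Hadoop's Configuration; it owns the simple name here. */
  static final class Configuration {
    private final Map<String, String> props = new HashMap<>();
    void set(String key, String value) { props.put(key, value); }
    String get(String key) { return props.get(key); }
  }

  /** The javax class must be fully qualified because the simple name is already taken. */
  static javax.security.auth.login.Configuration keytabLogin(Configuration conf) {
    final Map<String, String> options = new HashMap<>();
    options.put("principal", conf.get("principal")); // hypothetical property key
    options.put("keyTab", conf.get("keytab"));       // hypothetical property key
    options.put("useKeyTab", "true");
    options.put("storeKey", "true");
    final AppConfigurationEntry entry = new AppConfigurationEntry(
        "com.sun.security.auth.module.Krb5LoginModule", // assumes Oracle/OpenJDK
        LoginModuleControlFlag.REQUIRED, options);
    return new javax.security.auth.login.Configuration() {
      @Override
      public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        // Same single entry for every application name in this sketch.
        return new AppConfigurationEntry[] {entry};
      }
    };
  }
}
```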
Happy to wrap up a utility function to clean up all the IBM hacks etc. I
propose creating a separate jira for that, to limit the scope of this one.
Please let me know if you feel otherwise.
> Retry until TGT expires even if the UGI renewal thread encountered exception
> ----------------------------------------------------------------------------
>
> Key: HADOOP-13590
> URL: https://issues.apache.org/jira/browse/HADOOP-13590
> Project: Hadoop Common
> Issue Type: Improvement
> Components: security
> Affects Versions: 2.8.0, 2.7.3, 2.6.4
> Reporter: Xiao Chen
> Assignee: Xiao Chen
> Attachments: HADOOP-13590.01.patch, HADOOP-13590.02.patch,
> HADOOP-13590.03.patch, HADOOP-13590.04.patch, HADOOP-13590.05.patch,
> HADOOP-13590.06.patch, HADOOP-13590.07.patch, HADOOP-13590.08.patch
>
>
> The UGI has a background thread to renew the TGT. On exception, it
> [terminates
> itself|https://github.com/apache/hadoop/blob/bee9f57f5ca9f037ade932c6fd01b0dad47a1296/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java#L1013-L1014].
> If something temporarily goes wrong and results in an IOException, no
> renewal will be done even after it recovers, and the client will eventually
> fail to authenticate. We should retry on a best-effort basis until the TGT
> expires, in the hope that the error recovers before then.