[
https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208955#comment-14208955
]
Ming Ma commented on HADOOP-10597:
----------------------------------
Thanks, Chris.
The backoff retry policy is defined by interface {{ClientBackoffPolicy}}. There
are two implementations of the interface, {{NullClientBackoffPolicy}} and
{{LinearClientBackoffPolicy}}.
The experiment results are based on {{NullClientBackoffPolicy}}, which doesn't
specify any retry policy. Thus the RPC server will return an empty
{{RetriableException}} and let the client decide the retry policy. We can start
with this policy when we enable the feature in production. That will give us
useful information and help us improve the feature and make any necessary
modifications to {{ClientBackoffPolicy}} and its implementations in later
iterations.
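The client-side exponential back off on a {{RetriableException}} could look like the following minimal sketch. The class and method names here are illustrative only (including the local stand-in for the exception), not the actual Hadoop {{o.a.h.ipc}} code:

```java
// Hypothetical stand-in for the server-side exception; in Hadoop this
// would be org.apache.hadoop.ipc.RetriableException.
class RetriableException extends Exception {}

public class BackoffSketch {
    // Exponential back off: the delay doubles with each attempt,
    // capped so a long outage doesn't produce unbounded sleeps.
    static long delayMillis(int attempt, long baseMillis, long capMillis) {
        long d = baseMillis << Math.min(attempt, 20); // bound the shift to avoid overflow
        return Math.min(d, capMillis);
    }

    public static void main(String[] args) {
        // Delays for attempts 0..4 with a 100 ms base and a 10 s cap.
        for (int attempt = 0; attempt < 5; attempt++) {
            System.out.println(delayMillis(attempt, 100, 10_000)); // 100, 200, 400, 800, 1600
        }
    }
}
```

A real client would sleep for {{delayMillis(attempt, ...)}} after catching the exception and then retry the RPC, which is what the {{RetryInvocationHandler}} layer is meant to encapsulate.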
{{LinearClientBackoffPolicy}} specifies a retry policy based on the numbers of
succeeded and denied requests. The policy is then returned to the client, which
is expected to honor it. {{recentBackOffCount}} decreases with each successfully
queued request. So in your case, if a client is denied first and then terminates
before it retries, {{recentBackOffCount}} will still become zero as long as
enough requests from other clients are queued successfully. There shouldn't be a
case where the element is queued correctly but the client still gets a retry;
the warn message is there to catch a bad implementation of
{{ClientBackoffPolicy}}. We can remove it as it doesn't seem to be necessary.
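The counter behavior described above can be sketched as follows. This is only an illustration of the decrement-on-success semantics, with made-up class and method names, not the actual {{LinearClientBackoffPolicy}} source:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: denials raise recentBackOffCount, successfully
// queued requests drain it, and clients back off while it is positive.
class LinearBackoffSketch {
    private final AtomicInteger recentBackOffCount = new AtomicInteger();

    // Called when the server denies a request under heavy load.
    void onDenied() {
        recentBackOffCount.incrementAndGet();
    }

    // Called when a request is queued successfully. The count drains
    // toward zero, so a denial from a client that terminated before
    // retrying is eventually cleared by other clients' successes.
    void onQueued() {
        recentBackOffCount.updateAndGet(c -> Math.max(0, c - 1));
    }

    boolean shouldBackOff() {
        return recentBackOffCount.get() > 0;
    }
}
```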
Yes, it is better to rename {{oldValue}} to something else.
I will provide an updated patch after rebasing to address your comments.
> Evaluate if we can have RPC client back off when server is under heavy load
> ---------------------------------------------------------------------------
>
> Key: HADOOP-10597
> URL: https://issues.apache.org/jira/browse/HADOOP-10597
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Ming Ma
> Assignee: Ming Ma
> Attachments: HADOOP-10597-2.patch, HADOOP-10597.patch,
> MoreRPCClientBackoffEvaluation.pdf, RPCClientBackoffDesignAndEvaluation.pdf
>
>
> Currently if an application hits the NN too hard, RPC requests will block,
> assuming the OS doesn't run out of connections. Alternatively, RPC or the NN
> could throw a well-defined exception back to the client based on certain
> policies when it is under heavy load; the client would understand such an
> exception and do exponential back off, as another implementation of
> {{RetryInvocationHandler}}.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)