[
https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208955#comment-14208955
]
Ming Ma commented on HADOOP-10597:
----------------------------------
Thanks, Chris.
The backoff retry policy is defined by interface {{ClientBackoffPolicy}}. There
are two implementations of the interface, {{NullClientBackoffPolicy}} and
{{LinearClientBackoffPolicy}}.
The experiment results are based on {{NullClientBackoffPolicy}}, which doesn't
specify any retry policy. Thus the RPC server will return an empty
{{RetriableException}} and let the client decide the retry policy. We can start
with this policy when we enable the feature in production. That will give us
useful information and help us improve the feature and make any necessary
modifications to {{ClientBackoffPolicy}} and its implementations in later
iterations.
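The client-side exponential back off on a {{RetriableException}} could look like the following minimal sketch. The class and method names here are illustrative only (including the local stand-in for the exception), not the actual Hadoop {{o.a.h.ipc}} code:

```java
// Hypothetical stand-in for the server-side exception; in Hadoop this
// would be org.apache.hadoop.ipc.RetriableException.
class RetriableException extends Exception {}

public class BackoffSketch {
    // Exponential back off: the delay doubles with each attempt,
    // capped so a long outage doesn't produce unbounded sleeps.
    static long delayMillis(int attempt, long baseMillis, long capMillis) {
        long d = baseMillis << Math.min(attempt, 20); // bound the shift to avoid overflow
        return Math.min(d, capMillis);
    }

    public static void main(String[] args) {
        // Delays for attempts 0..4 with a 100 ms base and a 10 s cap.
        for (int attempt = 0; attempt < 5; attempt++) {
            System.out.println(delayMillis(attempt, 100, 10_000)); // 100, 200, 400, 800, 1600
        }
    }
}
```

A real client would sleep for {{delayMillis(attempt, ...)}} after catching the exception and then retry the RPC, which is what the {{RetryInvocationHandler}} layer is meant to encapsulate.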
{{LinearClientBackoffPolicy}} specifies a retry policy based on the numbers of
succeeded and denied requests. The policy is then returned to the client, which
is expected to honor it. {{recentBackOffCount}} decreases with each successfully
queued request. So in your case, if a client is denied first and then terminates
before it retries, {{recentBackOffCount}} will still become zero as long as
enough requests from other clients are queued successfully. There shouldn't be a
case where the element is queued correctly but the client still gets a retry;
the warn message is there to catch a bad implementation of
{{ClientBackoffPolicy}}. We can remove it as it doesn't seem to be necessary.
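The counter behavior described above can be sketched as follows. This is only an illustration of the decrement-on-success semantics, with made-up class and method names, not the actual {{LinearClientBackoffPolicy}} source:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: denials raise recentBackOffCount, successfully
// queued requests drain it, and clients back off while it is positive.
class LinearBackoffSketch {
    private final AtomicInteger recentBackOffCount = new AtomicInteger();

    // Called when the server denies a request under heavy load.
    void onDenied() {
        recentBackOffCount.incrementAndGet();
    }

    // Called when a request is queued successfully. The count drains
    // toward zero, so a denial from a client that terminated before
    // retrying is eventually cleared by other clients' successes.
    void onQueued() {
        recentBackOffCount.updateAndGet(c -> Math.max(0, c - 1));
    }

    boolean shouldBackOff() {
        return recentBackOffCount.get() > 0;
    }
}
```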
Yes, it is better to rename {{oldValue}} to something else.
I will provide an updated patch after rebasing to address your comments.
> Evaluate if we can have RPC client back off when server is under heavy load
> ---------------------------------------------------------------------------
>
> Key: HADOOP-10597
> URL: https://issues.apache.org/jira/browse/HADOOP-10597
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Ming Ma
> Assignee: Ming Ma
> Attachments: HADOOP-10597-2.patch, HADOOP-10597.patch,
> MoreRPCClientBackoffEvaluation.pdf, RPCClientBackoffDesignAndEvaluation.pdf
>
>
> Currently if an application hits the NN too hard, RPC requests will block,
> assuming the OS doesn't run out of connections. Alternatively, RPC or the NN
> could throw a well-defined exception back to the client based on certain
> policies when it is under heavy load; the client would understand such an
> exception and do exponential back off, as another implementation of
> {{RetryInvocationHandler}}.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)