[
https://issues.apache.org/jira/browse/HADOOP-7488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080491#comment-13080491
]
Konstantin Shvachko commented on HADOOP-7488:
---------------------------------------------
If {{rpcTimeout > 0}} then {{handleTimeout()}} will throw
{{SocketTimeoutException}} instead of going into the ping loop. Can you control the
required behavior by setting {{rpcTimeout > 0}} rather than introducing a limit on
the number of pings?
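To make the point above concrete, here is a minimal sketch of the decision being described. It is not the actual Hadoop IPC code: the class {{TimeoutSketch}} and its ping counter are hypothetical stand-ins, and only the names {{rpcTimeout}}, {{handleTimeout()}} and {{SocketTimeoutException}} come from the comment.

```java
import java.net.SocketTimeoutException;

/** Hypothetical stand-in for an IPC connection; not Hadoop's real class. */
final class TimeoutSketch {
    private final int rpcTimeout; // milliseconds; 0 means no RPC timeout is set
    private int pingsSent = 0;    // counts simulated pings (illustration only)

    public TimeoutSketch(int rpcTimeout) {
        this.rpcTimeout = rpcTimeout;
    }

    /** Called when a read on the socket times out. */
    public void handleTimeout(SocketTimeoutException e) throws SocketTimeoutException {
        if (rpcTimeout > 0) {
            throw e;     // a timeout is configured: surface it to the caller
        }
        pingsSent++;     // no timeout configured: ping and keep waiting
    }

    public int pingsSent() {
        return pingsSent;
    }

    public static void main(String[] args) {
        TimeoutSketch pinging = new TimeoutSketch(0);
        try {
            pinging.handleTimeout(new SocketTimeoutException("read timed out"));
            System.out.println("rpcTimeout = 0: pings sent = " + pinging.pingsSent());
        } catch (SocketTimeoutException e) {
            System.out.println("unexpected: " + e.getMessage());
        }

        TimeoutSketch failing = new TimeoutSketch(5000);
        try {
            failing.handleTimeout(new SocketTimeoutException("read timed out"));
        } catch (SocketTimeoutException e) {
            System.out.println("rpcTimeout > 0: caller sees " + e.getMessage());
        }
    }
}
```

Under these assumptions, a caller that wants a bounded wait only has to supply a positive {{rpcTimeout}}; no separate ping counter is needed.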
DataNodes and TaskTrackers are designed to ping NN and JT indefinitely, because
during startup you cannot predict when NN will come online as it depends on the
size of the image and edits. Also when NN becomes busy it is important for DNs
to keep retrying rather than assuming the NN is dead.
For DFSClient this may make sense, but I think they already time out. At least
DFSShell ls does. And even if they don't, this should be an HDFS change, not a
generic IPC change, which affects many Hadoop components.
As for HA, I don't know what you did for HA and therefore cannot understand what
problem you are trying to solve here. I can guess that you want DNs to switch to
another NN when they time out rather than retrying. In this case you should be
able to use {{rpcTimeout}}.
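If the guess above is right, the caller-side logic could look roughly like the sketch below: each call is bounded by a timeout, and a {{SocketTimeoutException}} moves the caller on to the next NN. Everything here is an assumption for illustration only, including the {{FailoverSketch}} class, the {{NameNodeCall}} interface and the addresses; none of it is existing Hadoop API.

```java
import java.net.SocketTimeoutException;
import java.util.List;

final class FailoverSketch {
    /** Hypothetical RPC call against one NameNode address. */
    interface NameNodeCall<T> {
        T invoke(String address) throws SocketTimeoutException;
    }

    /** Tries each address in order and moves on when a call times out. */
    static <T> T callWithFailover(List<String> addresses, NameNodeCall<T> call)
            throws SocketTimeoutException {
        SocketTimeoutException last = null;
        for (String address : addresses) {
            try {
                return call.invoke(address);
            } catch (SocketTimeoutException e) {
                last = e; // this NN timed out; try the next one
            }
        }
        if (last == null) {
            throw new IllegalArgumentException("no addresses given");
        }
        throw last; // every NN timed out
    }

    public static void main(String[] args) throws Exception {
        // Made-up addresses; the first one simulates an unplugged network.
        String result = callWithFailover(List.of("nn1:8020", "nn2:8020"), addr -> {
            if (addr.startsWith("nn1")) {
                throw new SocketTimeoutException("nn1 unreachable");
            }
            return "answered by " + addr;
        });
        System.out.println(result);
    }
}
```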
> When Namenode network is unplugged, DFSClient operations waits for ever
> -----------------------------------------------------------------------
>
> Key: HADOOP-7488
> URL: https://issues.apache.org/jira/browse/HADOOP-7488
> Project: Hadoop Common
> Issue Type: Bug
> Components: ipc
> Reporter: Uma Maheswara Rao G
> Assignee: Uma Maheswara Rao G
> Attachments: HADOOP-7488.patch
>
>
> When the NN/DN is shut down gracefully, DFSClient operations that are waiting
> for a response from the NN/DN throw an exception and return quickly.
> But when the NN/DN network is unplugged, DFSClient operations that are
> waiting for a response from the NN/DN wait forever.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira