[jira] [Commented] (HADOOP-7488) When Namenode network is unplugged, DFSClient operations waits for ever

Konstantin Shvachko (JIRA) Sat, 06 Aug 2011 17:25:54 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-7488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080491#comment-13080491
 ]


Konstantin Shvachko commented on HADOOP-7488:
---------------------------------------------

If {{rpcTimeout > 0}} then {{handleTimeout()}} will throw 
{{SocketTimeoutException}} instead of going into the ping loop. Can you control 
the required behavior by setting {{rpcTimeout > 0}} rather than introducing a 
limit on the number of pings?
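
To make the distinction concrete, here is a minimal sketch of that decision. It is a simplified model of the behavior described above, not the actual IPC client code: the class name {{TimeoutPolicySketch}}, the ping counter, and the assumption that a value of 0 means "no RPC timeout" are all illustrative.

```java
import java.net.SocketTimeoutException;

/**
 * Hypothetical, simplified model of the timeout decision discussed above.
 * Not the real Hadoop IPC client; names and fields are illustrative only.
 */
class TimeoutPolicySketch {
    // Assumption: 0 (or less) means "no RPC timeout configured".
    private final int rpcTimeoutMs;
    // Stand-in for actually sending a ping on the connection.
    private int pingsSent = 0;

    TimeoutPolicySketch(int rpcTimeoutMs) {
        this.rpcTimeoutMs = rpcTimeoutMs;
    }

    /** Called when a read on the connection times out. */
    void handleTimeout(SocketTimeoutException e) throws SocketTimeoutException {
        if (rpcTimeoutMs > 0) {
            // An RPC timeout is configured: surface the timeout to the caller.
            throw e;
        }
        // No RPC timeout: keep the connection alive by pinging and waiting again.
        pingsSent++;
    }

    int pingsSent() {
        return pingsSent;
    }
}
```

With the timeout set to 0 every read timeout results in another ping, so the caller waits indefinitely. With a positive value the first read timeout propagates as {{SocketTimeoutException}}, so no separate ping limit is needed to bound the wait.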

DataNodes and TaskTrackers are designed to ping NN and JT infinitely, because 
during startup you cannot predict when NN will come online as it depends on the 
size of the image and edits. Also when NN becomes busy it is important for DNs 
to keep retrying rather than assuming the NN is dead.

For DFSClient this may make sense, but I think they already time out. At least 
DFSShell ls does. And even if they don't, this should be an HDFS change, not a 
generic IPC change, which affects many Hadoop components.
 
As for HA, I don't know what you did for HA and therefore cannot understand what 
problem you are trying to solve here. I can guess that you want DNs to switch to 
another NN when they time out rather than retrying. In this case you should be 
able to use rpcTimeout.
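
If that guess is right, the caller-side logic could look roughly like the sketch below. Everything in it is hypothetical: {{FailoverSketch}}, the {{Call}} interface, and the address list are stand-ins for illustration and are not Hadoop APIs. It only shows that once a call can fail with {{SocketTimeoutException}}, switching to another NN can be handled above the IPC layer.

```java
import java.net.SocketTimeoutException;
import java.util.List;

/**
 * Hypothetical sketch: move on to the next address when a call times out.
 * None of these names are Hadoop APIs.
 */
class FailoverSketch {
    /** Stand-in for a single RPC invocation against one address. */
    interface Call<T> {
        T invoke(String address) throws SocketTimeoutException;
    }

    static <T> T callWithFailover(List<String> addresses, Call<T> call)
            throws SocketTimeoutException {
        SocketTimeoutException last = null;
        for (String address : addresses) {
            try {
                return call.invoke(address);
            } catch (SocketTimeoutException e) {
                // This address did not answer in time; try the next one.
                last = e;
            }
        }
        if (last == null) {
            throw new SocketTimeoutException("no addresses configured");
        }
        // Every address timed out: report the most recent timeout.
        throw last;
    }
}
```

The design point is that the retry-or-switch policy lives in the component that knows about multiple NNs, while the IPC layer only needs to report the timeout.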

> When Namenode network is unplugged, DFSClient operations waits for ever
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-7488
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7488
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>         Attachments: HADOOP-7488.patch
>
>
> When the NN/DN is shut down gracefully, DFSClient operations that are waiting 
> for a response from the NN/DN throw an exception and return quickly.
> But when the NN/DN network is unplugged, DFSClient operations that are 
> waiting for a response from the NN/DN wait forever.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
