Hello,

Can anyone please tell me why the default value of 'yarn.resourcemanager.container.liveness-monitor.interval-ms' in yarn-default.xml <https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml> is so high? This parameter determines "How often to check that containers are still alive". The default value is 600000 ms, or 10 minutes. So if a node manager fails, the resource manager detects the dead containers only after 10 minutes.

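For reference, this is how I understand the property would be overridden in yarn-site.xml. It is only a sketch: the value of 30000 ms (30 seconds) is an arbitrary illustrative choice, not a recommendation, and I have not verified what effect changing it has on failure detection.

```xml
<!-- yarn-site.xml: hypothetical override, value chosen for illustration only -->
<property>
  <name>yarn.resourcemanager.container.liveness-monitor.interval-ms</name>
  <!-- 30000 ms = 30 seconds; the shipped default is 600000 ms = 10 minutes -->
  <value>30000</value>
</property>
```
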
I am running a wordcount job on my university cluster. In the middle of the run, I stopped the node manager on one node (the data node was still running) and found that the completion time increased by about 10 minutes because of the node manager failure.

Thanks in advance,
Tanvir
