Re: Data node not able to contact the resource manager

Jeff Hubbs Mon, 05 Aug 2019 08:02:17 -0700

Does "hadoopresourcemanager" resolve to a machine that's a Hadoop resource manager? In Hadoop, it's absolutely vital that all names resolve correctly in both directions.
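
A quick way to test resolution in both directions is a small script like the one below. It is only a sketch: the function name `check_round_trip` is made up for illustration, and treating a match on the short host name as "good enough" is an assumption that may not suit every DNS setup. The resolvers are passed in as parameters so the check can be tried without touching the network.

```python
import socket


def check_round_trip(hostname,
                     forward=socket.gethostbyname,
                     reverse=socket.gethostbyaddr):
    """Resolve hostname to an address, then resolve that address back.

    Returns (ip, names, ok). ok is True when one of the names returned by
    the reverse lookup matches the original host name, either exactly or
    on its first label (assumption: short-name matches are acceptable).
    """
    ip = forward(hostname)
    primary, aliases, _addresses = reverse(ip)
    names = [primary] + list(aliases)
    wanted = hostname.lower()
    wanted_short = wanted.split(".")[0]
    ok = any(
        name.lower() == wanted or name.lower().split(".")[0] == wanted_short
        for name in names
    )
    return ip, names, ok
```

Running `check_round_trip("hadoopresourcemanager")` on each node should return the same address everywhere, with `ok` set to `True`.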


On 8/5/19 10:55 AM, Daniel Santos wrote:

Hello Jon,


I have the following yarn-site.xml:

<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.acl.enable</name>
        <value>0</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoopresourcemanager</value>
    </property>
    <property>
        <name>yarn.nodemanager,aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>1536</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>1536</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>128</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>hadoopresourcemanager:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>hadoopresourcemanager:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>hadoopresourcemanager:8031</value>
    </property>
</configuration>
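
A file like this can be sanity-checked by parsing it and listing what Hadoop would actually see. The following is a minimal sketch, not part of any Hadoop tooling: the helper names `load_properties` and `suspicious_names` are hypothetical, and the rule that property names contain only letters, digits, dots, underscores, and hyphens is an assumption used as a heuristic.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: well-formed property names use only these characters.
NAME_OK = re.compile(r"^[A-Za-z0-9._-]+$")


def load_properties(xml_text):
    """Return a dict of property name -> value from a *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        props[name] = value
    return props


def suspicious_names(props):
    """Return the property names that contain unexpected characters."""
    return sorted(name for name in props if not NAME_OK.match(name))
```

Passing the contents of yarn-site.xml to `load_properties` and then printing `suspicious_names(props)` together with `props.get("yarn.resourcemanager.scheduler.address")` shows at a glance whether a name was mistyped and which scheduler address the file defines.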

So I can say that I have already tried your suggestion.

Cheers

On 5 Aug 2019, at 15:22, Jon Mack <[email protected]> wrote:

Looks to me like it's missing the resource manager configuration, based on the port it's trying to connect to.

On Mon, Aug 5, 2019 at 9:15 AM Daniel Santos <[email protected]> wrote:


    Hello,

    I have a cluster with one machine holding the name nodes (primary
    and secondary), a YARN node (resource manager), and four data nodes.
    I am running Hadoop 2.7.0.

    When I submit a job to the cluster I can see it in the scheduler
    webpage. If I go to the container page and check the logs, the
    syslog file ends with the following:

    2019-08-05 14:58:05,962 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
    2019-08-05 14:58:06,962 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
    2019-08-05 14:58:07,963 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
    2019-08-05 14:58:08,965 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
    2019-08-05 14:58:09,966 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
    2019-08-05 14:58:10,967 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
    2019-08-05 14:58:11,968 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
    2019-08-05 14:58:12,969 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
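
    Every retry above targets 0.0.0.0 on port 8030 rather than a named host. To confirm that across a longer syslog, something like the following sketch could pull out the distinct targets. The regular expression is written against the line format shown above and is an assumption about other log variants; the name `retry_targets` is made up for illustration.

```python
import re

# Matches "... Retrying connect to server: <host>/<ip>:<port>" as seen above.
RETRY = re.compile(r"Retrying connect to server:\s*([^/\s]+)/([\d.]+):(\d+)")


def retry_targets(lines):
    """Return the set of (ip, port) pairs that the client retried against."""
    targets = set()
    for line in lines:
        match = RETRY.search(line)
        if match:
            targets.add((match.group(2), int(match.group(3))))
    return targets


def is_wildcard(ip):
    """True for the IPv4 wildcard address, which is not a reachable peer."""
    return ip == "0.0.0.0"
```

    If every target reported by `retry_targets` is a wildcard address, the process writing the log is probably not seeing the configured resource manager host at all.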


    I have checked the configuration of the resource manager and of the
    data node where the application is running, and the property
    yarn.resourcemanager.hostname that I have set in yarn-site.xml
    is shown.
    I have disabled IPv6 on the YARN machine, as some posts on the
    internet suggested. All the configuration files are the same in
    every node of the cluster.

    Still, I am getting these errors, and the application ends with a
    timeout.

    What am I doing wrong?

    Thanks
    Regards
