Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

rammohan ganapavarapu Sun, 21 Aug 2016 08:33:05 -0700

So in job.properties, what should the jobTracker property be: the RM ip:port, or
the scheduler port, which is 8030? If I use 8030 I get an "unknown protocol"
proto buffer error.


On Aug 21, 2016 7:37 AM, "Sunil Govind" <[email protected]> wrote:

> Hi.
>
> It seems it's an Oozie issue. From the conf, the RM scheduler is running at port
> 8030,
> but your job.properties is using 8032. I suggest you double-check
> your Oozie configuration and confirm that the settings needed to contact
> the RM are intact. Sharing a link as well:
> https://discuss.zendesk.com/hc/en-us/articles/203355837-How-to-run-a-MapReduce-jar-using-Oozie-workflow
>
> Thanks
> Sunil
>
>
> On Sun, Aug 21, 2016 at 8:41 AM rammohan ganapavarapu <
> [email protected]> wrote:
>
>> Please find attached the config that I got from the YARN UI, and the AM and RM logs.
>> I only see it connecting to 0.0.0.0:8030 when I submit the job using
>> Oozie; if I submit it with yarn jar it works fine, as I posted in my
>> previous posts.
>>
>> Here is my Oozie job.properties file. I have a Java class that just
>> prints
>>
>> nameNode=hdfs://master01:8020
>> jobTracker=master01:8032
>> workflowName=EchoJavaJob
>> oozie.use.system.libpath=true
>>
>> queueName=default
>> hdfsWorkflowHome=/user/uap/oozieWorkflows
>>
>> workflowPath=${nameNode}${hdfsWorkflowHome}/${workflowName}
>> oozie.wf.application.path=${workflowPath}
>>
>> Please let me know if you find any clue as to why it's trying to connect to
>> 0.0.0.0:8030.
>>
>> Thanks,
>> Ram
>>
>>
>> On Fri, Aug 19, 2016 at 11:54 PM, Sunil Govind <[email protected]>
>> wrote:
>>
>>> Hi Ram
>>>
>>> From the console log, as Rohith said, the AM is looking for the RM at 8030. So
>>> please confirm the RM port once.
>>> Could you please share the AM and RM logs?
>>>
>>> Thanks
>>> Sunil
>>>
>>> On Sat, Aug 20, 2016 at 10:36 AM rammohan ganapavarapu <
>>> [email protected]> wrote:
>>>
>>>> Yes, I did configure it.
>>>>
>>>> On Aug 19, 2016 7:22 PM, "Rohith Sharma K S" <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> From the discussion below and the AM logs, I see that the AM container has
>>>>> launched but is not able to connect to the RM.
>>>>>
>>>>> This looks like a configuration issue. Would you check in your job.xml
>>>>> whether *yarn.resourcemanager.scheduler.address* has been
>>>>> configured?
>>>>>
>>>>> Essentially, this address is required by MRAppMaster for connecting to the RM
>>>>> for heartbeats. If you do not configure it, the default value will be used,
>>>>> i.e. port
>>>>> 8030.
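>>>>>
>>>>> As a rough sketch only: in a stock Hadoop 2.x yarn-default.xml this
>>>>> address defaults to ${yarn.resourcemanager.hostname}:8030, and the
>>>>> hostname itself defaults to 0.0.0.0, which would explain a
>>>>> 0.0.0.0:8030 target in the AM log when the job configuration carries no
>>>>> override. Assuming a placeholder RM host named rm-host.example.com (a
>>>>> hypothetical name, not taken from this thread), the yarn-site.xml
>>>>> entries might look like:
>>>>>
>>>>>     <!-- rm-host.example.com is hypothetical; substitute your RM host -->
>>>>>     <property>
>>>>>       <name>yarn.resourcemanager.hostname</name>
>>>>>       <value>rm-host.example.com</value>
>>>>>     </property>
>>>>>     <property>
>>>>>       <name>yarn.resourcemanager.scheduler.address</name>
>>>>>       <value>${yarn.resourcemanager.hostname}:8030</value>
>>>>>     </property>
>>>>>
>>>>> These values need to be visible in the configuration the job is
>>>>> submitted with, not only on the RM node.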
>>>>>
>>>>>
>>>>> Thanks & Regards
>>>>> Rohith Sharma K S
>>>>>
>>>>> On Aug 20, 2016, at 7:02 AM, rammohan ganapavarapu <
>>>>> [email protected]> wrote:
>>>>>
>>>>> Even if the cluster doesn't have enough resources, it should not be connecting to
>>>>> "/0.0.0.0:8030", right? It should connect to my <RM_HOST:8030>; not sure
>>>>> why it's trying to connect to 0.0.0.0:8030.
>>>>>
>>>>> I have verified the config and removed all traces of 0.0.0.0, still no luck.
>>>>>
>>>>> org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
>>>>>
>>>>> If anyone has any clue, please share.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Ram
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Aug 19, 2016 at 2:32 PM, rammohan ganapavarapu <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> When I submit a job using yarn it seems to work; it's failing only with
>>>>>> Oozie, I guess. Not sure what is missing.
>>>>>>
>>>>>> yarn jar /uap/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar pi 20 1000
>>>>>> Number of Maps  = 20
>>>>>> Samples per Map = 1000
>>>>>> .
>>>>>> .
>>>>>> .
>>>>>> Job Finished in 19.622 seconds
>>>>>> Estimated value of Pi is 3.14280000000000000000
>>>>>>
>>>>>> Ram
>>>>>>
>>>>>> On Fri, Aug 19, 2016 at 11:46 AM, rammohan ganapavarapu <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> OK, I used yarn-utils.py to get the correct values for my
>>>>>>> cluster, updated those properties, and restarted the RM and NM, but still no
>>>>>>> luck. Not sure what I am missing; any other insights will help.
>>>>>>>
>>>>>>> Below are my properties from yarn-site.xml and mapred-site.xml.
>>>>>>>
>>>>>>> python yarn-utils.py -c 24 -m 63 -d 3 -k False
>>>>>>>  Using cores=24 memory=63GB disks=3 hbase=False
>>>>>>>  Profile: cores=24 memory=63488MB reserved=1GB usableMem=62GB disks=3
>>>>>>>  Num Container=6
>>>>>>>  Container Ram=10240MB
>>>>>>>  Used Ram=60GB
>>>>>>>  Unused Ram=1GB
>>>>>>>  yarn.scheduler.minimum-allocation-mb=10240
>>>>>>>  yarn.scheduler.maximum-allocation-mb=61440
>>>>>>>  yarn.nodemanager.resource.memory-mb=61440
>>>>>>>  mapreduce.map.memory.mb=5120
>>>>>>>  mapreduce.map.java.opts=-Xmx4096m
>>>>>>>  mapreduce.reduce.memory.mb=10240
>>>>>>>  mapreduce.reduce.java.opts=-Xmx8192m
>>>>>>>  yarn.app.mapreduce.am.resource.mb=5120
>>>>>>>  yarn.app.mapreduce.am.command-opts=-Xmx4096m
>>>>>>>  mapreduce.task.io.sort.mb=1024
>>>>>>>
>>>>>>>
>>>>>>>     <property>
>>>>>>>       <name>mapreduce.map.memory.mb</name>
>>>>>>>       <value>5120</value>
>>>>>>>     </property>
>>>>>>>     <property>
>>>>>>>       <name>mapreduce.map.java.opts</name>
>>>>>>>       <value>-Xmx4096m</value>
>>>>>>>     </property>
>>>>>>>     <property>
>>>>>>>       <name>mapreduce.reduce.memory.mb</name>
>>>>>>>       <value>10240</value>
>>>>>>>     </property>
>>>>>>>     <property>
>>>>>>>       <name>mapreduce.reduce.java.opts</name>
>>>>>>>       <value>-Xmx8192m</value>
>>>>>>>     </property>
>>>>>>>     <property>
>>>>>>>       <name>yarn.app.mapreduce.am.resource.mb</name>
>>>>>>>       <value>5120</value>
>>>>>>>     </property>
>>>>>>>     <property>
>>>>>>>       <name>yarn.app.mapreduce.am.command-opts</name>
>>>>>>>       <value>-Xmx4096m</value>
>>>>>>>     </property>
>>>>>>>     <property>
>>>>>>>       <name>mapreduce.task.io.sort.mb</name>
>>>>>>>       <value>1024</value>
>>>>>>>     </property>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>     <property>
>>>>>>>       <name>yarn.scheduler.minimum-allocation-mb</name>
>>>>>>>       <value>10240</value>
>>>>>>>     </property>
>>>>>>>
>>>>>>>     <property>
>>>>>>>       <name>yarn.scheduler.maximum-allocation-mb</name>
>>>>>>>       <value>61440</value>
>>>>>>>     </property>
>>>>>>>
>>>>>>>     <property>
>>>>>>>       <name>yarn.nodemanager.resource.memory-mb</name>
>>>>>>>       <value>61440</value>
>>>>>>>     </property>
>>>>>>>
>>>>>>>
>>>>>>> Ram
>>>>>>>
>>>>>>> On Thu, Aug 18, 2016 at 11:14 PM, tkg_cangkul <[email protected]
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Maybe this link can serve as a reference for tuning the cluster:
>>>>>>>>
>>>>>>>> http://jason4zhu.blogspot.co.id/2014/10/memory-configuration-in-hadoop.html
>>>>>>>>
>>>>>>>>
>>>>>>>> On 19/08/16 11:13, rammohan ganapavarapu wrote:
>>>>>>>>
>>>>>>>> Do you know what properties to tune?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Ram
>>>>>>>>
>>>>>>>> On Thu, Aug 18, 2016 at 9:11 PM, tkg_cangkul <[email protected]
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> I think that's because you don't have enough resources. You can tune
>>>>>>>>> your cluster config to maximize your resources.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 19/08/16 11:03, rammohan ganapavarapu wrote:
>>>>>>>>>
>>>>>>>>> I don't see anything odd except this; not sure whether I have to worry
>>>>>>>>> about it or not.
>>>>>>>>>
>>>>>>>>> 2016-08-19 03:29:26,621 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
>>>>>>>>> 2016-08-19 03:29:27,646 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
>>>>>>>>> 2016-08-19 03:29:28,647 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It keeps printing this in the app container logs.
>>>>>>>>>
>>>>>>>>> On Thu, Aug 18, 2016 at 8:20 PM, tkg_cangkul <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Maybe you can check the logs via port 8088 in your browser; that
>>>>>>>>>> is the RM UI. Just choose your job ID and then check the logs.
>>>>>>>>>>
>>>>>>>>>> On 19/08/16 10:14, rammohan ganapavarapu wrote:
>>>>>>>>>>
>>>>>>>>>> Sunil,
>>>>>>>>>>
>>>>>>>>>> Thank you for your input; below are my server metrics for the RM.
>>>>>>>>>> I have also attached the RM UI for the capacity scheduler resources. How else can I
>>>>>>>>>> find out?
>>>>>>>>>>
>>>>>>>>>> {
>>>>>>>>>>       "name": "Hadoop:service=ResourceManager,name=QueueMetrics,q0=root",
>>>>>>>>>>       "modelerType": "QueueMetrics,q0=root",
>>>>>>>>>>       "tag.Queue": "root",
>>>>>>>>>>       "tag.Context": "yarn",
>>>>>>>>>>       "tag.Hostname": "hadoop001",
>>>>>>>>>>       "running_0": 0,
>>>>>>>>>>       "running_60": 0,
>>>>>>>>>>       "running_300": 0,
>>>>>>>>>>       "running_1440": 0,
>>>>>>>>>>       "AppsSubmitted": 1,
>>>>>>>>>>       "AppsRunning": 0,
>>>>>>>>>>       "AppsPending": 0,
>>>>>>>>>>       "AppsCompleted": 0,
>>>>>>>>>>       "AppsKilled": 0,
>>>>>>>>>>       "AppsFailed": 1,
>>>>>>>>>>       "AllocatedMB": 0,
>>>>>>>>>>       "AllocatedVCores": 0,
>>>>>>>>>>       "AllocatedContainers": 0,
>>>>>>>>>>       "AggregateContainersAllocated": 2,
>>>>>>>>>>       "AggregateContainersReleased": 2,
>>>>>>>>>>       "AvailableMB": 64512,
>>>>>>>>>>       "AvailableVCores": 24,
>>>>>>>>>>       "PendingMB": 0,
>>>>>>>>>>       "PendingVCores": 0,
>>>>>>>>>>       "PendingContainers": 0,
>>>>>>>>>>       "ReservedMB": 0,
>>>>>>>>>>       "ReservedVCores": 0,
>>>>>>>>>>       "ReservedContainers": 0,
>>>>>>>>>>       "ActiveUsers": 0,
>>>>>>>>>>       "ActiveApplications": 0
>>>>>>>>>>     },
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 18, 2016 at 6:49 PM, Sunil Govind <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi
>>>>>>>>>>>
>>>>>>>>>>> It could be for many reasons. Also, I am not sure
>>>>>>>>>>> which scheduler you are using; please share more details, such as the RM
>>>>>>>>>>> log.
>>>>>>>>>>>
>>>>>>>>>>> I could point out a few possible reasons:
>>>>>>>>>>>  - Not enough resources in the cluster can cause this.
>>>>>>>>>>>  - If using the Capacity Scheduler, this can happen when the queue
>>>>>>>>>>> capacity is maxed out.
>>>>>>>>>>>  - Similarly, if max-am-resource-percent is exceeded at the queue
>>>>>>>>>>> level, the AM container may not be launched.
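>>>>>>>>>>>
>>>>>>>>>>> For the last point, a sketch of where that limit lives, assuming the
>>>>>>>>>>> Capacity Scheduler and a queue named default (an assumption about
>>>>>>>>>>> your setup). The cluster-wide property in capacity-scheduler.xml is
>>>>>>>>>>> yarn.scheduler.capacity.maximum-am-resource-percent, which defaults
>>>>>>>>>>> to 0.1; the value 0.2 below is purely illustrative, not a
>>>>>>>>>>> recommendation:
>>>>>>>>>>>
>>>>>>>>>>>     <property>
>>>>>>>>>>>       <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
>>>>>>>>>>>       <value>0.2</value>
>>>>>>>>>>>     </property>
>>>>>>>>>>>     <!-- optional per-queue override for root.default (queue name assumed) -->
>>>>>>>>>>>     <property>
>>>>>>>>>>>       <name>yarn.scheduler.capacity.root.default.maximum-am-resource-percent</name>
>>>>>>>>>>>       <value>0.2</value>
>>>>>>>>>>>     </property>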
>>>>>>>>>>>
>>>>>>>>>>> You could check the RM log to get more information on whether the AM container
>>>>>>>>>>> was launched.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Sunil
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Aug 19, 2016 at 5:37 AM rammohan ganapavarapu <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> When I submit an MR job, I get this in the AM UI, but it
>>>>>>>>>>>> never finishes. What am I missing?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Ram
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>
