Turns out we made a stupid mistake: our system was mixing configuration between an old cluster and a new cluster. So, things are working now.
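For anyone hitting the same symptom: a quick way to catch this kind of old-cluster/new-cluster config mix is to diff the history-related keys of the two clusters' mapred-site.xml files. A minimal Python sketch (the file paths and the key list are illustrative assumptions, not from the thread):

```python
# Sketch: diff the history-related settings of two Hadoop config files
# to spot an old-cluster/new-cluster mismatch. Paths and the key list
# below are hypothetical; extend HISTORY_KEYS for your setup.
import xml.etree.ElementTree as ET

HISTORY_KEYS = {
    "mapreduce.jobhistory.done-dir",
    "mapreduce.jobhistory.intermediate-done-dir",
    "yarn.app.mapreduce.am.staging-dir",
    "mapreduce.jobhistory.address",
}

def load_props(path):
    """Parse a Hadoop *-site.xml into a {name: value} dict."""
    root = ET.parse(path).getroot()
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }

def diff_history_config(old_path, new_path):
    """Print any history-related key whose value differs between files."""
    old, new = load_props(old_path), load_props(new_path)
    for key in sorted(HISTORY_KEYS):
        if old.get(key) != new.get(key):
            print(f"{key}: old={old.get(key)!r} new={new.get(key)!r}")

# Hypothetical usage:
# diff_history_config("old-cluster/mapred-site.xml",
#                     "new-cluster/mapred-site.xml")
```

Run against the old and new cluster configs, any divergent done-dir or staging-dir value would show up immediately.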
Thanks,
Ben

________________________________
From: Benjamin Ross
Sent: Thursday, August 18, 2016 10:05 AM
To: Rohith Sharma K S; Gao, Yunlong
Cc: [email protected]
Subject: RE: Issue with Hadoop Job History Server

Rohith,
Thanks - we're still having issues. Can you help out with this?

How do you specify the done directory for an MR job? The job history done dir is mapreduce.jobhistory.done-dir. I specified the job one as mapreduce.jobtracker.jobhistory.location, as per the documentation here:
https://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

They're both set to the same thing. I did a recursive ls on HDFS and it doesn't seem like there are any directories called "done" with recent data in them. All of the data in /mr-history is old. Here's a summary of that ls:

drwx------   - yarn   hadoop  0 2016-07-14 16:39 /ats/done
drwxr-xr-x   - yarn   hadoop  0 2016-07-14 16:39 /ats/done/1468528507723
drwxr-xr-x   - yarn   hadoop  0 2016-07-14 16:39 /ats/done/1468528507723/0000
drwxr-xr-x   - yarn   hadoop  0 2016-07-25 20:10 /ats/done/1468528507723/0000/000
drwxrwxrwx   - mapred hadoop  0 2016-07-19 14:47 /mr-history/done
drwxrwx---   - mapred hadoop  0 2016-07-19 14:47 /mr-history/done/2016
drwxrwx---   - mapred hadoop  0 2016-07-19 14:47 /mr-history/done/2016/07
drwxrwx---   - mapred hadoop  0 2016-07-27 13:49 /mr-history/done/2016/07/19
drwxrwxrwt   - bross  hdfs    0 2016-08-15 22:39 /tmp/hadoop-yarn/staging/history/done_intermediate
=========> lots of recent data in /tmp/hadoop-yarn/staging/history/done_intermediate

Here's our mapred-site.xml:

<configuration>
  <property>
    <name>mapreduce.admin.map.child.java.opts</name>
    <value>-server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.6.0-3796</value>
  </property>
  <property>
    <name>mapreduce.admin.reduce.child.java.opts</name>
    <value>-server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.6.0-3796</value>
  </property>
  <property>
    <name>mapreduce.admin.user.env</name>
    <value>LD_LIBRARY_PATH=/usr/hdp/2.3.6.0-3796/hadoop/lib/native:/usr/hdp/2.3.6.0-3796/hadoop/lib/native/Linux-amd64-64</value>
  </property>
  <property>
    <name>mapreduce.am.max-attempts</name>
    <value>2</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/2.3.6.0-3796/hadoop/lib/hadoop-lzo-0.6.0.2.3.6.0-3796.jar:/etc/hadoop/conf/secure</value>
  </property>
  <property>
    <name>mapreduce.application.framework.path</name>
    <value>/hdp/apps/2.3.6.0-3796/mapreduce/mapreduce.tar.gz#mr-framework</value>
  </property>
  <property>
    <name>mapreduce.cluster.administrators</name>
    <value> hadoop</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.job.counters.max</name>
    <value>130</value>
  </property>
  <property>
    <name>mapreduce.job.emit-timeline-data</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.job.reduce.slowstart.completedmaps</name>
    <value>0.05</value>
  </property>
  <property>
    <name>mapreduce.job.user.classpath.first</name>
    <value>true</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>bodcdevhdp6.dev.lattice.local:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.bind-host</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.done-dir</name>
    <value>/mr-history/done</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>/mr-history/tmp</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.recovery.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.recovery.store.class</name>
    <value>org.apache.hadoop.mapreduce.v2.hs.HistoryServerLeveldbStateStoreService</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.recovery.store.leveldb.path</name>
    <value>/hadoop/mapreduce/jhs</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>bodcdevhdp6.dev.lattice.local:19888</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.jobhistory.completed.location</name>
    <value>/mr-history/done</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx4915m</value>
  </property>
  <property>
    <name>mapreduce.map.log.level</name>
    <value>INFO</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>6144</value>
  </property>
  <property>
    <name>mapreduce.map.output.compress</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.map.sort.spill.percent</name>
    <value>0.7</value>
  </property>
  <property>
    <name>mapreduce.map.speculative</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.output.fileoutputformat.compress</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.output.fileoutputformat.compress.type</name>
    <value>BLOCK</value>
  </property>
  <property>
    <name>mapreduce.reduce.input.buffer.percent</name>
    <value>0.0</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx9830m</value>
  </property>
  <property>
    <name>mapreduce.reduce.log.level</name>
    <value>INFO</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>12288</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.fetch.retry.enabled</name>
    <value>1</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.fetch.retry.interval-ms</name>
    <value>1000</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.fetch.retry.timeout-ms</name>
    <value>30000</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.input.buffer.percent</name>
    <value>0.7</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.merge.percent</name>
    <value>0.66</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>30</value>
  </property>
  <property>
    <name>mapreduce.reduce.speculative</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.shuffle.port</name>
    <value>13562</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>100</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>2047</value>
  </property>
  <property>
    <name>mapreduce.task.timeout</name>
    <value>300000</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.admin-command-opts</name>
    <value>-Dhdp.version=2.3.6.0-3796</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx4915m -Dhdp.version=${hdp.version}</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.log.level</name>
    <value>INFO</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>6144</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/user</value>
  </property>
</configuration>

Thanks,
Ben

________________________________
From: Rohith Sharma K S [[email protected]]
Sent: Thursday, August 18, 2016 3:17 AM
To: Gao, Yunlong
Cc: [email protected]; Benjamin Ross
Subject: Re: Issue with Hadoop Job History Server

MR jobs and the JHS should have the same done-dir configuration, if one is configured; otherwise the staging-dir should be the same for both. Make sure the job and the JHS have the same configuration values. What usually happens is that the MR AppMaster writes the job history file to one location while the HistoryServer tries to read it from a different one, which causes the JHS to display empty jobs.

Thanks & Regards
Rohith Sharma K S

On Aug 18, 2016, at 12:35 PM, Gao, Yunlong <[email protected]> wrote:

To whom it may concern,

I am using Hadoop 2.7.1.2.3.6.0-3796, with the Hortonworks distribution HDP-2.3.6.0-3796.
I have a question about the Hadoop Job History Server. After I set everything up, the resource manager, name nodes, and data nodes all seem to be running fine, but the job history server is not working correctly: its UI does not show any jobs, and REST calls to it do not work either. I also noticed that there are no logs in HDFS under the "mapreduce.jobhistory.done-dir" directory.

I have tried different things, including restarting the job history server and monitoring its log -- no errors or exceptions are observed. I also renamed /hadoop/mapreduce/jhs/mr-jhs-state (the job history server's recovery state) and restarted it again, but no particular error happens. I tried some other suggestions borrowed from online blogs/documents but had no luck. Any help would be very much appreciated.

Thanks,
Yunlong
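Rohith's explanation can be made concrete. The MR AppMaster writes intermediate history under mapreduce.jobhistory.intermediate-done-dir, which in Hadoop 2.7's mapred-default.xml defaults to ${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate. A rough Python sketch of that variable expansion (a simplification of Hadoop's Configuration substitution, not the real implementation; the job/JHS config dicts below are illustrative):

```python
# Sketch: resolve where job history files land, using the documented
# Hadoop 2.7 defaults from mapred-default.xml (verify against your
# distribution). A job-side/JHS-side mismatch here produces exactly the
# "empty JHS UI" symptom described in this thread.

DEFAULTS = {
    "yarn.app.mapreduce.am.staging-dir": "/tmp/hadoop-yarn/staging",
    "mapreduce.jobhistory.intermediate-done-dir":
        "${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate",
    "mapreduce.jobhistory.done-dir":
        "${yarn.app.mapreduce.am.staging-dir}/history/done",
}

def resolve(key, conf):
    """Look up key in conf (falling back to DEFAULTS) and expand
    ${...} references, simplifying Hadoop's substitution logic."""
    value = conf.get(key, DEFAULTS.get(key, ""))
    while "${" in value:
        start = value.index("${")
        end = value.index("}", start)
        inner = value[start + 2:end]
        value = value[:start] + resolve(inner, conf) + value[end + 1:]
    return value

# Job side: nothing configured, so the defaults apply.
job_conf = {}
# JHS side: explicitly configured, as in the mapred-site.xml above.
jhs_conf = {
    "mapreduce.jobhistory.intermediate-done-dir": "/mr-history/tmp",
    "mapreduce.jobhistory.done-dir": "/mr-history/done",
}

key = "mapreduce.jobhistory.intermediate-done-dir"
print("job writes to: ", resolve(key, job_conf))   # /tmp/hadoop-yarn/staging/history/done_intermediate
print("JHS reads from:", resolve(key, jhs_conf))   # /mr-history/tmp
```

With defaults on the job side and the explicit /mr-history/tmp on the JHS side, the two resolve to different paths, which matches the recent data piling up under /tmp/hadoop-yarn/staging/history/done_intermediate in the ls earlier in the thread: a job still running with stale (old-cluster) configuration writes to the default location that the newly configured JHS never reads.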
