Turns out we made a stupid mistake: our system was mixing configuration between an old cluster and a new cluster. So, things are working now.
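For anyone hitting the same symptom: a quick way to catch this kind of old-cluster/new-cluster config mix is to diff the history-related keys of the two clusters' mapred-site.xml files. A minimal Python sketch (the file paths and the key list are illustrative assumptions, not from the thread):

```python
# Sketch: diff the history-related settings of two Hadoop config files
# to spot an old-cluster/new-cluster mismatch. Paths and the key list
# below are hypothetical; extend HISTORY_KEYS for your setup.
import xml.etree.ElementTree as ET

HISTORY_KEYS = {
    "mapreduce.jobhistory.done-dir",
    "mapreduce.jobhistory.intermediate-done-dir",
    "yarn.app.mapreduce.am.staging-dir",
    "mapreduce.jobhistory.address",
}

def load_props(path):
    """Parse a Hadoop *-site.xml into a {name: value} dict."""
    root = ET.parse(path).getroot()
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }

def diff_history_config(old_path, new_path):
    """Print any history-related key whose value differs between files."""
    old, new = load_props(old_path), load_props(new_path)
    for key in sorted(HISTORY_KEYS):
        if old.get(key) != new.get(key):
            print(f"{key}: old={old.get(key)!r} new={new.get(key)!r}")

# Hypothetical usage:
# diff_history_config("old-cluster/mapred-site.xml",
#                     "new-cluster/mapred-site.xml")
```

Run against the old and new cluster configs, any divergent done-dir or staging-dir value would show up immediately.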
Thanks,
Ben

________________________________
From: Benjamin Ross
Sent: Thursday, August 18, 2016 10:05 AM
To: Rohith Sharma K S; Gao, Yunlong
Cc: [email protected]
Subject: RE: Issue with Hadoop Job History Server

Rohith,
Thanks - we're still having issues. Can you help out with this?

How do you specify the done directory for an MR job? The job history done dir is mapreduce.jobhistory.done-dir. I specified the job one as mapreduce.jobtracker.jobhistory.location, as per the documentation here:
https://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

They're both set to the same thing. I did a recursive ls on HDFS and it doesn't seem like there are any directories called "done" with recent data in them. All of the data in /mr-history is old. Here's a summary of that ls:

drwx------   - yarn   hadoop  0 2016-07-14 16:39 /ats/done
drwxr-xr-x   - yarn   hadoop  0 2016-07-14 16:39 /ats/done/1468528507723
drwxr-xr-x   - yarn   hadoop  0 2016-07-14 16:39 /ats/done/1468528507723/0000
drwxr-xr-x   - yarn   hadoop  0 2016-07-25 20:10 /ats/done/1468528507723/0000/000
drwxrwxrwx   - mapred hadoop  0 2016-07-19 14:47 /mr-history/done
drwxrwx---   - mapred hadoop  0 2016-07-19 14:47 /mr-history/done/2016
drwxrwx---   - mapred hadoop  0 2016-07-19 14:47 /mr-history/done/2016/07
drwxrwx---   - mapred hadoop  0 2016-07-27 13:49 /mr-history/done/2016/07/19
drwxrwxrwt   - bross  hdfs    0 2016-08-15 22:39 /tmp/hadoop-yarn/staging/history/done_intermediate
=========> lots of recent data in /tmp/hadoop-yarn/staging/history/done_intermediate

Here's our mapred-site.xml:

<configuration>
  <property>
    <name>mapreduce.admin.map.child.java.opts</name>
    <value>-server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.6.0-3796</value>
  </property>
  <property>
    <name>mapreduce.admin.reduce.child.java.opts</name>
    <value>-server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.6.0-3796</value>
  </property>
  <property>
    <name>mapreduce.admin.user.env</name>
    <value>LD_LIBRARY_PATH=/usr/hdp/2.3.6.0-3796/hadoop/lib/native:/usr/hdp/2.3.6.0-3796/hadoop/lib/native/Linux-amd64-64</value>
  </property>
  <property>
    <name>mapreduce.am.max-attempts</name>
    <value>2</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/2.3.6.0-3796/hadoop/lib/hadoop-lzo-0.6.0.2.3.6.0-3796.jar:/etc/hadoop/conf/secure</value>
  </property>
  <property>
    <name>mapreduce.application.framework.path</name>
    <value>/hdp/apps/2.3.6.0-3796/mapreduce/mapreduce.tar.gz#mr-framework</value>
  </property>
  <property>
    <name>mapreduce.cluster.administrators</name>
    <value> hadoop</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.job.counters.max</name>
    <value>130</value>
  </property>
  <property>
    <name>mapreduce.job.emit-timeline-data</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.job.reduce.slowstart.completedmaps</name>
    <value>0.05</value>
  </property>
  <property>
    <name>mapreduce.job.user.classpath.first</name>
    <value>true</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>bodcdevhdp6.dev.lattice.local:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.bind-host</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.done-dir</name>
    <value>/mr-history/done</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>/mr-history/tmp</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.recovery.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.recovery.store.class</name>
    <value>org.apache.hadoop.mapreduce.v2.hs.HistoryServerLeveldbStateStoreService</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.recovery.store.leveldb.path</name>
    <value>/hadoop/mapreduce/jhs</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>bodcdevhdp6.dev.lattice.local:19888</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.jobhistory.completed.location</name>
    <value>/mr-history/done</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx4915m</value>
  </property>
  <property>
    <name>mapreduce.map.log.level</name>
    <value>INFO</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>6144</value>
  </property>
  <property>
    <name>mapreduce.map.output.compress</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.map.sort.spill.percent</name>
    <value>0.7</value>
  </property>
  <property>
    <name>mapreduce.map.speculative</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.output.fileoutputformat.compress</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.output.fileoutputformat.compress.type</name>
    <value>BLOCK</value>
  </property>
  <property>
    <name>mapreduce.reduce.input.buffer.percent</name>
    <value>0.0</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx9830m</value>
  </property>
  <property>
    <name>mapreduce.reduce.log.level</name>
    <value>INFO</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>12288</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.fetch.retry.enabled</name>
    <value>1</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.fetch.retry.interval-ms</name>
    <value>1000</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.fetch.retry.timeout-ms</name>
    <value>30000</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.input.buffer.percent</name>
    <value>0.7</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.merge.percent</name>
    <value>0.66</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>30</value>
  </property>
  <property>
    <name>mapreduce.reduce.speculative</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.shuffle.port</name>
    <value>13562</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>100</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>2047</value>
  </property>
  <property>
    <name>mapreduce.task.timeout</name>
    <value>300000</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.admin-command-opts</name>
    <value>-Dhdp.version=2.3.6.0-3796</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx4915m -Dhdp.version=${hdp.version}</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.log.level</name>
    <value>INFO</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>6144</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/user</value>
  </property>
</configuration>

Thanks,
Ben

________________________________
From: Rohith Sharma K S [[email protected]]
Sent: Thursday, August 18, 2016 3:17 AM
To: Gao, Yunlong
Cc: [email protected]; Benjamin Ross
Subject: Re: Issue with Hadoop Job History Server

MR jobs and the JHS should have the same done-dir configuration, if one is configured; otherwise the staging-dir should be the same for both. Make sure the job and the JHS have the same configuration values. What usually happens is that the MR AppMaster writes the job history file to one location while the HistoryServer tries to read it from a different one, which causes the JHS to display empty jobs.

Thanks & Regards
Rohith Sharma K S

On Aug 18, 2016, at 12:35 PM, Gao, Yunlong <[email protected]> wrote:

To whom it may concern,

I am using Hadoop 2.7.1.2.3.6.0-3796, with the Hortonworks distribution HDP-2.3.6.0-3796.
I have a question about the Hadoop Job History Server. After I set everything up, the resource manager, name nodes, and data nodes all seem to be running fine, but the job history server is not working correctly: its UI does not show any jobs, and REST calls to it do not work either. I also noticed that there are no logs in HDFS under the "mapreduce.jobhistory.done-dir" directory.

I have tried different things, including restarting the job history server and monitoring its log -- no errors or exceptions are observed. I also renamed /hadoop/mapreduce/jhs/mr-jhs-state (the job history server's recovery state) and restarted it again, but no particular error happens. I tried some other suggestions borrowed from online blogs/documents but had no luck. Any help would be very much appreciated.

Thanks,
Yunlong
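Rohith's explanation can be made concrete. The MR AppMaster writes intermediate history under mapreduce.jobhistory.intermediate-done-dir, which in Hadoop 2.7's mapred-default.xml defaults to ${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate. A rough Python sketch of that variable expansion (a simplification of Hadoop's Configuration substitution, not the real implementation; the job/JHS config dicts below are illustrative):

```python
# Sketch: resolve where job history files land, using the documented
# Hadoop 2.7 defaults from mapred-default.xml (verify against your
# distribution). A job-side/JHS-side mismatch here produces exactly the
# "empty JHS UI" symptom described in this thread.

DEFAULTS = {
    "yarn.app.mapreduce.am.staging-dir": "/tmp/hadoop-yarn/staging",
    "mapreduce.jobhistory.intermediate-done-dir":
        "${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate",
    "mapreduce.jobhistory.done-dir":
        "${yarn.app.mapreduce.am.staging-dir}/history/done",
}

def resolve(key, conf):
    """Look up key in conf (falling back to DEFAULTS) and expand
    ${...} references, simplifying Hadoop's substitution logic."""
    value = conf.get(key, DEFAULTS.get(key, ""))
    while "${" in value:
        start = value.index("${")
        end = value.index("}", start)
        inner = value[start + 2:end]
        value = value[:start] + resolve(inner, conf) + value[end + 1:]
    return value

# Job side: nothing configured, so the defaults apply.
job_conf = {}
# JHS side: explicitly configured, as in the mapred-site.xml above.
jhs_conf = {
    "mapreduce.jobhistory.intermediate-done-dir": "/mr-history/tmp",
    "mapreduce.jobhistory.done-dir": "/mr-history/done",
}

key = "mapreduce.jobhistory.intermediate-done-dir"
print("job writes to: ", resolve(key, job_conf))   # /tmp/hadoop-yarn/staging/history/done_intermediate
print("JHS reads from:", resolve(key, jhs_conf))   # /mr-history/tmp
```

With defaults on the job side and the explicit /mr-history/tmp on the JHS side, the two resolve to different paths, which matches the recent data piling up under /tmp/hadoop-yarn/staging/history/done_intermediate in the ls earlier in the thread: a job still running with stale (old-cluster) configuration writes to the default location that the newly configured JHS never reads.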
