Any way I can tell hadoop to use the /mnt dir instead of the
/tmp/hadoop-${user.name} directory to store the files?

Thanks
Bhushan Pathak

On Wed, Jun 14, 2017 at 3:06 PM, Brahma Reddy Battula
<[email protected]> wrote:

> Please see my comments inline.
>
> Regards
> Brahma Reddy Battula
>
> *From:* Bhushan Pathak [mailto:[email protected]]
> *Sent:* 14 June 2017 17:14
> *To:* [email protected]
> *Subject:* HDFS file replication to slave nodes not working
>
> Hello,
>
> I have hadoop 2.7.3 running on a 3-node cluster [1 master, 2 slaves]. The
> hdfs-site.xml file has the following config -
>
> <property>
>   <name>dfs.namenode.name.dir</name>
>   <value>file:/mnt/hadoop_store/datanode</value>
> </property>
> <property>
>   <name>dfs.datanode.name.dir</name>
>   <value>file:/mnt/hadoop_store/namenode</value>
> </property>
>
> ==> *The property should be "dfs.datanode.data.dir". Please have a look
> at the following for all the default configurations:*
> *http://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml*
>
> I used the 'hdfs dfs -put' command to upload 3 csv files to HDFS, which
> was successful.
>
> My assumption is that the 3 csv files should be present on all 3 nodes,
> either under the datanode or the namenode directory.
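The property-name fix above can be sketched as a corrected hdfs-site.xml fragment. The /mnt paths are taken from the thread; pointing each property at its like-named directory (the original mail had the two values crossed) is my assumption:

```xml
<!-- hdfs-site.xml (sketch): corrected property names.
     Matching each property to its like-named directory is an assumption;
     the file must be copied to every node and HDFS restarted to take effect. -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/mnt/hadoop_store/namenode</value>
</property>
<property>
  <!-- was mistyped as "dfs.datanode.name.dir" in the original config -->
  <name>dfs.datanode.data.dir</name>
  <value>file:/mnt/hadoop_store/datanode</value>
</property>
```

Note that a config change alone does not move existing block data out of /tmp; files uploaded before the fix may need to be re-uploaded.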
> On the master, I can see the following files -
>
> [hadoop@master hadoop-2.7.3]$ bin/hdfs dfs -ls /usr/hadoop
> Found 3 items
> -rw-r--r--   3 hadoop supergroup   124619 2017-06-14 14:34 /usr/hadoop/Final_Album_file.csv
> -rw-r--r--   3 hadoop supergroup    68742 2017-06-14 14:34 /usr/hadoop/Final_Artist_file.csv
> -rw-r--r--   3 hadoop supergroup  2766110 2017-06-14 14:34 /usr/hadoop/Final_Tracks_file.csv
> [hadoop@master hadoop-2.7.3]$ ls /mnt/hadoop_store/namenode/
> [hadoop@master hadoop-2.7.3]$ ls /mnt/hadoop_store/datanode/
> current  in_use.lock
> [hadoop@master hadoop-2.7.3]$ ls /mnt/hadoop_store/datanode/current/
> edits_0000000000000000001-0000000000000000002  edits_0000000000000000027-0000000000000000028  edits_0000000000000000055-0000000000000000056
> edits_0000000000000000003-0000000000000000004  edits_0000000000000000029-0000000000000000030  edits_0000000000000000057-0000000000000000058
> edits_0000000000000000005-0000000000000000006  edits_0000000000000000031-0000000000000000032  edits_0000000000000000059-0000000000000000060
> edits_0000000000000000007-0000000000000000008  edits_0000000000000000033-0000000000000000034  edits_0000000000000000061-0000000000000000064
> edits_0000000000000000009-0000000000000000010  edits_0000000000000000035-0000000000000000036  edits_0000000000000000065-0000000000000000096
> edits_0000000000000000011-0000000000000000012  edits_0000000000000000037-0000000000000000038  edits_inprogress_0000000000000000097
> edits_0000000000000000013-0000000000000000014  edits_0000000000000000039-0000000000000000040  fsimage_0000000000000000064
> edits_0000000000000000015-0000000000000000016  edits_0000000000000000041-0000000000000000042  fsimage_0000000000000000064.md5
> edits_0000000000000000017-0000000000000000017  edits_0000000000000000043-0000000000000000044  fsimage_0000000000000000096
> edits_0000000000000000018-0000000000000000019  edits_0000000000000000045-0000000000000000046  fsimage_0000000000000000096.md5
> edits_0000000000000000020-0000000000000000020  edits_0000000000000000047-0000000000000000048  seen_txid
> edits_0000000000000000021-0000000000000000022  edits_0000000000000000049-0000000000000000050  VERSION
> edits_0000000000000000023-0000000000000000024  edits_0000000000000000051-0000000000000000052
> edits_0000000000000000025-0000000000000000026  edits_0000000000000000053-0000000000000000054
> [hadoop@master hadoop-2.7.3]$
>
> While on the 2 slave nodes, there are only empty directories. Is my
> assumption correct that the 3 csv files should be replicated to the
> slave nodes as well? If yes, why are they missing from the slave nodes?
> Additionally, are the files that I see in the datanode/current directory
> on the master the actual csv files that I have uploaded?
>
> *Yes, the files will be replicated to 3 nodes (based on
> "dfs.replication", which is "3" by default).*
>
> *The location you are checking is wrong, since the property name is
> wrong; by default the data will be stored under
> "/tmp/hadoop-${user.name}".*
>
> *The data under the "datanode/current" directory is metadata for all
> operations.*
>
> *Please go through the following design document to learn more about
> HDFS:*
> *http://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html*
>
> Thanks
> Bhushan Pathak
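Regarding the question at the top of the thread: the /tmp/hadoop-${user.name} default comes from the hadoop.tmp.dir property in core-site.xml, from which the storage defaults (e.g. dfs.datanode.data.dir defaults to file://${hadoop.tmp.dir}/dfs/data) are derived. A minimal sketch, assuming /mnt/hadoop_store exists and is writable by the hadoop user on every node:

```xml
<!-- core-site.xml (sketch): move Hadoop's default storage root off /tmp.
     The /mnt/hadoop_store path is taken from the thread; using it as the
     tmp root is an assumption. Restart the daemons after the change. -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/mnt/hadoop_store</value>
</property>
```

After restarting and re-uploading, `hdfs fsck /usr/hadoop -files -blocks -locations` lists which datanodes hold each block replica — a more direct check than looking for the csv file names on disk, since datanodes store blocks as blk_* files rather than under the original file names.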
