Hi Sachin,

On Mon, Dec 18, 2017 at 9:09 AM, Sachin Tiwari <[email protected]> wrote:
> Hi,
>
> I am trying to use Hadoop as a distributed file storage system.
>
> I did a POC with a small cluster with 1 namenode and 4 datanodes and I
> was able to get/put files using the hdfs client and monitor the datanodes'
> status on: http://master-machine:50070/dfshealth.html
>
> However, I have a few open questions that I would like to discuss with
> you guys before taking the solution to the next level.
>
> *Questions are as follows:*
>
> 1) Is hdfs good at handling binary data? Like executables, zip, VDI, etc?

Yes for storage. For processing with Yarn it will depend on your processing and format.

> 2) How many datanodes can a namenode handle? Assuming it's running on 24
> cores, 90GB RAM and handling files between 200MB and 1GB in size?
> (assuming the default block size of 128MB)

A namenode is limited more by the number of files than by the number of datanodes; a namenode is able to manage several hundred datanodes. An example of sizing: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_command-line-installation/content/configuring-namenode-heap-size.html

> 3) Is there a way to tune the cluster setup, i.e. determine the best
> values for block size, replication factor, heap, etc.?

To tune the configuration you can change the values at the cluster level or per folder/file. The list of values is here: https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml. Look at least at dfs.replication, dfs.replication.max, dfs.namenode.replication.min and dfs.blocksize. I don't understand what you mean by 'heap size' for HDFS.

> 4) I was also curious how much time a namenode service takes to
> acknowledge that a datanode has gone down?

That depends on your configuration. There is a heartbeat (dfs.heartbeat.interval) of 3s by default, but several parameters define several statuses for a datanode (live, stale and dead).
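To make those timings concrete, the thresholds can be derived from the configuration values; here is a quick sketch using the stock Hadoop 2.x defaults (values taken from hdfs-default.xml):

```python
# Sketch of how the stale and dead datanode thresholds are derived,
# using the Hadoop 2.x defaults from hdfs-default.xml.
heartbeat_interval_s = 3              # dfs.heartbeat.interval (seconds)
recheck_interval_ms = 5 * 60 * 1000   # dfs.namenode.heartbeat.recheck-interval (ms)
stale_interval_ms = 30 * 1000         # dfs.namenode.stale.datanode.interval (ms)

# A datanode is marked stale after the stale interval (30s by default),
# and dead after 2 * recheck-interval + 10 * heartbeat-interval.
dead_after_s = 2 * recheck_interval_ms // 1000 + 10 * heartbeat_interval_s

print(stale_interval_ms // 1000)  # 30  -> seconds before a node is stale
print(dead_after_s)               # 630 -> 10min30s before a node is dead
```

Shortening these intervals makes failure detection faster but also makes the cluster more sensitive to transient network hiccups.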
See dfs.namenode.stale.datanode.interval and dfs.namenode.heartbeat.recheck-interval. By default a stale datanode stops receiving writes after 30s, and the blocks of a dead datanode are re-replicated after 10min30s.

> 5) What happens next? That is, does the namenode start replicating the
> blocks of that down datanode to other available datanodes to meet the
> replication factor?

Yes.

> 6) What happens when the datanode comes back up? Won't there be more
> blocks (replicas) in the system than expected, as the namenode has
> replicated them while it was down?

The extra blocks won't be used: the namenode detects over-replicated blocks and has the excess replicas deleted. HDFS rebalancing will be required if you want to have data on all nodes.

> 7) Also, after coming up, does the datanode perform cleanup for the files
> (blocks) that were pruned while it was down? That is, reclaim the disk
> space by deleting blocks that were deleted while it was down?

See the answer to 6).

> 8) During copying/replication, does a datanode with more available space
> get priority over a datanode with comparatively less space?

No.

> 9) What are your recommendations for a cluster of around 2500 machines
> with 24 cores, 90GB RAM and 500MB to 1TB disk space to spare for HDFS?
> Are there any good tools to manage such a huge cluster to track its
> health and other status?

Clusters with 2500 nodes are very, very rare and need huge expertise (very far from your actual questions) acquired over years. My first recommendation is to start with a small cluster (10-20 nodes), learn with it, automate every provisioning step and, most important: hire experts.

> 10) For a non-networking guy like me, who is not the owner of the network
> topology of the machines, what is the best recommendation from your side
> to make the cluster rack-aware? I mean, what should I do to benefit from
> rack-awareness in the cluster?

Given your ambitions, hire or employ an expert for the cluster setup.
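For a sense of what that setup involves: rack awareness is driven by a topology script (net.topology.script.file.name) that Hadoop calls with datanode IPs or hostnames as arguments and that must print one rack path per argument. A minimal sketch (the subnet-to-rack mapping below is entirely hypothetical; adapt it to your real network layout):

```python
#!/usr/bin/env python
# Minimal sketch of a Hadoop topology script (net.topology.script.file.name).
# Hadoop invokes it with one or more IPs/hostnames and expects one rack
# path per argument on stdout. The subnet-to-rack mapping is hypothetical.
import sys

RACKS = {                       # hypothetical subnet -> rack mapping
    "10.0.1": "/dc1/rack1",
    "10.0.2": "/dc1/rack2",
}
DEFAULT_RACK = "/default-rack"  # Hadoop's fallback rack name

def rack_for(host):
    # Use the first three octets of an IPv4 address as the "subnet" key;
    # anything unrecognized falls back to the default rack.
    subnet = ".".join(host.split(".")[:3])
    return RACKS.get(subnet, DEFAULT_RACK)

if __name__ == "__main__":
    for host in sys.argv[1:]:
        print(rack_for(host))
```

The script itself is the easy part; the benefit only materializes if the rack paths reflect the real failure domains (switches, power), which is why knowing the physical topology matters.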
The rack-awareness configuration itself is quite simple; the network configuration on the nodes is more complex (to have enough bandwidth, with bonding): https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-common/RackAwareness.html

Regards,
Philippe

> Thanks,
> Sachin

--
Philippe Kernévez
Directeur technique (Suisse), [email protected]
+41 79 888 33 32

Retrouvez OCTO sur OCTO Talk : http://blog.octo.com
OCTO Technology http://www.octo.ch
