A 400 GB heap for the NameNode is rather high; GC pause times will be
very long.
For a cluster with about 6 PB, roughly 20 GB of heap is adequate.
Since you mention the setup is HA, it is safe to assume that the fsimage
is checkpointed at regular intervals, so a manual NameNode restart does
not need much extra memory to replay edits into the fsimage. It is still
worth budgeting some headroom for that, but nowhere near 400 GB.
A good way to estimate is from a test. In one of my tests, writing about
2 TB of data to HDFS with block size = 128 MB and replication 3 created
about 18k blocks (18051).
*Memory needed for those blocks:*
hdfs oiv -p XML -printToScreen -i /mnt/namenode/current/fsimage_0000000000000051228 \
  | egrep "block|inode" | wc -l \
  | awk '{printf "Objects=%d : Suggested Xms=%0dm Xmx=%0dm\n", $1, (($1 / 1000000) * 1024), (($1 / 1000000) * 1024)}'
Objects=18051 : Suggested Xms=18m Xmx=18m
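The awk expression above amounts to a rule of thumb of about 1024 MB of
heap per million objects. A minimal Python sketch of the same arithmetic
follows; the function name suggested_heap_mb is made up for illustration
and is not part of any Hadoop tool.

```python
def suggested_heap_mb(object_count: int) -> int:
    """Rule of thumb from the awk one-liner: ~1024 MB of heap per million objects.

    Integer arithmetic truncates the result, as awk's %d does.
    """
    return object_count * 1024 // 1_000_000


if __name__ == "__main__":
    objects = 18051  # object count reported by the fsimage scan in this thread
    heap = suggested_heap_mb(objects)
    print(f"Objects={objects} : Suggested Xms={heap}m Xmx={heap}m")
```

For small test clusters the result is only a lower bound; the JVM itself
needs more than a few tens of MB to run a NameNode.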
*Maths for Cluster:*
----------------
150 bytes per object (an object is a block, file, or directory)
24 TB x 2000 nodes = 48000 TB
Block size = 128 MB
Total blocks = 48000 TB / 128 MB = 393216000 blocks
Adjust for the replication factor, which is 3 by default; each
replicated block takes only about 16 bytes of NameNode memory.
393216000 / 3 = 131072000 unique blocks
(131072000 x 150 bytes) + (131072000 x 16 bytes) =
19660800000 + 2097152000 = 21757952000 bytes = *about 20.26 GB*
(taking 1 GB = 1024^3 bytes)
In addition, memory is needed for namespace metadata: each file name
also accounts for about 150 bytes of NameNode memory.
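The cluster arithmetic above can be sketched in Python so that it is
easy to rerun with other node counts or block sizes. The 150-byte and
16-byte figures are the ones quoted in this mail; the function name and
the namespace_objects parameter (number of files and directories) are
assumptions added for illustration. With the figures above it gives
roughly 20 GB before any namespace objects are counted.

```python
BYTES_PER_OBJECT = 150   # per block, file, or directory (figure from this thread)
BYTES_PER_REPLICA = 16   # per replicated block (figure from this thread)
MB_PER_TB = 1024 * 1024


def estimate_namenode_heap_bytes(nodes, tb_per_node, block_size_mb=128,
                                 replication=3, namespace_objects=0):
    """Estimate NameNode heap in bytes, following the arithmetic in this mail."""
    # Raw capacity expressed as a number of block-sized slots.
    raw_blocks = nodes * tb_per_node * MB_PER_TB // block_size_mb
    # Each unique block occupies `replication` slots on disk.
    unique_blocks = raw_blocks // replication
    block_bytes = unique_blocks * (BYTES_PER_OBJECT + BYTES_PER_REPLICA)
    # Hypothetical extra term: files and directories at 150 bytes each.
    return block_bytes + namespace_objects * BYTES_PER_OBJECT


if __name__ == "__main__":
    heap = estimate_namenode_heap_bytes(nodes=2000, tb_per_node=24)
    print(f"Estimated heap: {heap / 1024**3:.2f} GB")
```

Pass a non-zero namespace_objects once you know the file and directory
count, and add headroom on top of the result for edit replay and GC.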
On 18/8/17 3:19 pm, [email protected] wrote:
Hi All,
HDFS Federation with PB+ rest data (Single Name Service is HA, Based
on QJM) , Apache 2.7.3 on Redhat 6.5 with JDK1.7.
1.Plan to deploy NN on server(32cores, 512G) , any precious advice
about JVM OPTS? If set heap size to about 400G with CMS GC collector,
any obvious problems?
2.If there are many groups of Name Services, is it more efficient
that part of Name Services share one group of JNs? Any advice to JN?
3.Welcome any words to Federation, thanks!
Thanks in advance,
Doris