Yes, my job has about 160,000 maps and my cluster is not getting fully utilized.
Around 6,000 maps ran in 2 hours, and then I killed the job. At any point in
time only 40 containers are running; that's just 11% of my cluster capacity.
{
"classification": "mapred-site",
"properties": {
"mapreduce.job.reduce.slowstart.completedmaps":"1",
"mapreduce.reduce.memory.mb": "3072",
"mapreduce.map.memory.mb": "2208",
"mapreduce.map.java.opts":"-Xmx1800m",
"mapreduce.map.cpu.vcores":"1"
}
},
{
"classification": "yarn-site",
"properties": {
"yarn.scheduler.minimum-allocation-mb": "32",
"yarn.scheduler.maximum-allocation-mb": "253952",
"yarn.scheduler.maximum-allocation-vcores": "128",
"yarn.nodemanager.vmem-pmem-ratio": "3",
"yarn.nodemanager.vmem-check-enabled": "true",
"yarn.nodemanager.resource.cpu-vcores": "16",
"yarn.nodemanager.resource.memory-mb": "23040"
}
}
Each node's capacity:
Disk space = 100 GB
Memory = 28 GB
Processors = 8
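A quick back-of-envelope check (my own arithmetic, assuming YARN's default behavior of packing containers by memory only, with the request rounded up to a multiple of the minimum allocation) suggests each NodeManager can fit only about 10 map containers with these settings:

```python
# Rough YARN container math for the settings above.
# Assumption: memory is the binding resource (the default resource
# calculator ignores vcores when sizing containers).
nm_memory_mb = 23040   # yarn.nodemanager.resource.memory-mb
nm_vcores = 16         # yarn.nodemanager.resource.cpu-vcores
map_memory_mb = 2208   # mapreduce.map.memory.mb
map_vcores = 1         # mapreduce.map.cpu.vcores
min_alloc_mb = 32      # yarn.scheduler.minimum-allocation-mb

# YARN rounds each container request up to a multiple of the minimum
# allocation (2208 is already a multiple of 32, so no change here).
container_mb = -(-map_memory_mb // min_alloc_mb) * min_alloc_mb

containers_by_memory = nm_memory_mb // container_mb   # 23040 // 2208 = 10
containers_by_vcores = nm_vcores // map_vcores        # 16 // 1 = 16

print(container_mb)          # 2208
print(containers_by_memory)  # 10 -> memory-bound limit per node
print(containers_by_vcores)  # 16
```

So each node is memory-bound at roughly 10 concurrent map containers; 40 running containers would mean only about 4 nodes' worth of capacity is being used, which points at the scheduler or the number of active nodes rather than per-node headroom.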