Hello
I am using Spark 1.6.2 and Hadoop 2.7.2 on a single-node cluster
(Pseudo-Distributed Operation settings, for testing purposes).
Spark
# spark-defaults.conf
spark.driver.memory 512m
spark.yarn.am.memory 512m
spark.executor.memory 512m
spark.executor.cores 2
spark.dynamicAllocation.enabled true
spark.shuffle.service.enabled true
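
For context, when dynamic allocation is enabled the executor count is governed by its own bounds rather than spark.executor.instances; a minimal sketch of the related spark-defaults.conf entries (the values below are illustrative, not the ones in use here):

```
# Illustrative only: executor-count bounds for dynamic allocation (Spark 1.6)
spark.dynamicAllocation.minExecutors 1
spark.dynamicAllocation.maxExecutors 4
spark.dynamicAllocation.initialExecutors 1
```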
YARN
# yarn-site.xml
yarn.scheduler.maximum-allocation-vcores 32
yarn.scheduler.minimum-allocation-vcores 1
yarn.scheduler.maximum-allocation-mb 16384
yarn.scheduler.minimum-allocation-mb 64
yarn.scheduler.fair.preemption true
yarn.resourcemanager.scheduler.class
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
yarn.nodemanager.aux-services spark_shuffle
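
(The yarn-site.xml entries above are flattened; in the actual file each pair is a <property> element. A sketch of how the spark_shuffle aux-service is normally wired up, assuming the Spark 1.6 YARN shuffle jar is on the NodeManager classpath; the second property is an assumption about this setup, not copied from it:)

```xml
<!-- Sketch of yarn-site.xml form; the class registration is an assumed addition -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```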
# mapred-site.xml
yarn.app.mapreduce.am.resource.mb 512
yarn.app.mapreduce.am.resource.cpu-vcores 1
yarn.app.mapreduce.am.command-opts -Xmx384m
mapreduce.map.memory.mb 1024
mapreduce.map.java.opts -Xmx768m
mapreduce.reduce.memory.mb 1024
mapreduce.reduce.java.opts -Xmx768m
For every Spark application I submit, I always get:
- ApplicationMaster with 1024 MB of RAM and 1 vcore
- One container with 1024 MB of RAM and 1 vcore
I have three questions about using dynamic allocation with the Fair Scheduler:
1) How do I change the ApplicationMaster's maximum memory to 512m?
2) How do I get more than one container running per application? (With
dynamic allocation enabled I cannot set spark.executor.instances.)
3) I noticed that YARN seems to ignore yarn.app.mapreduce.am.resource.mb,
yarn.app.mapreduce.am.resource.cpu-vcores, and
yarn.app.mapreduce.am.command-opts when the scheduler is the Fair Scheduler. Am I right?
Thanks,
Cleosson