Thanks a lot! It is now working better! Such a small parameter; I didn't know it existed, and it is not commonly modified.
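For anyone reading this thread later: below is a toy simulation (my own sketch, not YARN's actual scheduler code; the node names are made up, though 31 containers per node matches the cluster described further down) of why capping the per-heartbeat assignments spreads containers more evenly across nodes:

```python
# Toy model of per-heartbeat container assignment. Each node heartbeats in
# round-robin order; the scheduler hands out up to `max_assign` pending
# containers per heartbeat (-1 means unlimited, as with the default value of
# yarn.scheduler.capacity.per-node-heartbeat.maximum-container-assignments).

def simulate(pending, node_capacities, max_assign):
    """Return {node: containers assigned} after round-robin heartbeats."""
    assigned = {node: 0 for node in node_capacities}
    while pending > 0 and any(assigned[n] < node_capacities[n] for n in assigned):
        for node, cap in node_capacities.items():
            room = cap - assigned[node]
            batch = room if max_assign == -1 else min(room, max_assign)
            take = min(batch, pending)
            assigned[node] += take
            pending -= take
            if pending == 0:
                break
    return assigned

caps = {"node1": 31, "node2": 31, "node3": 31}
# Unlimited: the first nodes to heartbeat are filled up (bin packing).
print(simulate(87, caps, -1))  # {'node1': 31, 'node2': 31, 'node3': 25}
# Capped at 2 per heartbeat: the load is spread nearly evenly.
print(simulate(87, caps, 2))   # {'node1': 30, 'node2': 29, 'node3': 28}
```

With the default unlimited setting, whichever NodeManager heartbeats first absorbs as many containers as it can hold; a small cap forces the pending containers to be distributed over heartbeats from all nodes.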
Or

On Thu, Jan 10, 2019 at 16:31 Hariharan <[email protected]> wrote:

> Not an expert on the capacity scheduler, but the above two are not
> queue-level configurations, so I think the changes would not be picked up
> by running refreshQueues. You would need to restart the RM for the new
> values to take effect.
>
> Thanks,
> Hari
>
> On Thu, Jan 10, 2019 at 7:41 PM Or Raz <[email protected]> wrote:
>
>> I have googled more about it, and it seems that two parameters define
>> this "bin packing" behavior. According to
>> https://hadoop.apache.org/docs/r2.9.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Other_Properties
>> yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled
>> is set to true by default, and with
>> yarn.scheduler.capacity.per-node-heartbeat.maximum-container-assignments
>> set to -1 it can assign all the containers the NodeManager "said" it is
>> capable of running (which could explain the bin packing behavior for the
>> first NodeManager that answers with a heartbeat message).
>> Following Apache's instructions, I have inserted the following into my
>> *capacity-scheduler.xml* in the hadoop/etc/hadoop folder:
>>
>> <property>
>>   <name>yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled</name>
>>   <value>true</value>
>>   <description>
>>     Whether to allow multiple container assignments in one NodeManager
>>     heartbeat. Defaults to true.
>>   </description>
>> </property>
>> <property>
>>   <name>yarn.scheduler.capacity.per-node-heartbeat.maximum-container-assignments</name>
>>   <value>2</value>
>>   <description>
>>     If multiple-assignments-enabled is true, the maximum number of
>>     containers that can be assigned in one NodeManager heartbeat.
>>     Defaults to -1, which sets no limit.
>>   </description>
>> </property>
>>
>> I have checked the configuration file, and I am using the capacity
>> scheduler (I have enabled
>> yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled
>> again just to be sure).
>> Furthermore, after running "yarn rmadmin -refreshQueues" I haven't seen
>> any change in the allocation of the Mappers or the Reducers.
>>
>> hadoop2@master:~$ yarn rmadmin -refreshQueues
>> 19/01/10 16:06:33 INFO client.RMProxy: Connecting to ResourceManager at
>> master/172.31.24.83:8033
>>
>> What am I missing here?
>>
>> Or
>>
>> On Wed, Jan 9, 2019 at 23:57 Or Raz <[email protected]> wrote:
>>
>>> Thanks for the tips!
>>> Since I haven't set any scheduler for YARN (on purpose), I am using the
>>> default one (Capacity).
>>> I have looked in yarn-site.xml and in the configuration tab (in the
>>> JobHistory UI), and neither of the parameters you mentioned was there
>>> (so they haven't been set).
>>> You said that I should look at "locality settings"; can you be more
>>> specific about what to look for and where?
>>> Also, it is worth mentioning that I am using three computers and the
>>> replication factor (of HDFS) is three too. Thus, all the data (even the
>>> input) is on every computer, and the memory of each computer is the
>>> same (two t2.xlarge and one m4.xlarge), while I am using
>>> DefaultResourceCalculator.
>>>
>>> Or
>>>
>>> On Wed, Jan 9, 2019 at 23:28 Aaron Eng <[email protected]> wrote:
>>>
>>>> The settings are very relevant to having an equal number of containers
>>>> running on each node if you have an idle cluster and want to
>>>> distribute containers for a single job. An ApplicationMaster submits
>>>> requests for container allocations to the ResourceManager. The
>>>> MRAppMaster will request all the map containers at once, and the
>>>> FairScheduler will find NodeManagers with capacity to fulfill the
>>>> container requests. If assignmultiple is enabled, then you generally
>>>> won't get an even assignment of containers (+/- 1 container) per node.
>>>> Before you say it's not relevant, you should check whether your
>>>> environment uses the FairScheduler and whether multiple assignment is
>>>> enabled.
>>>> If so, that's likely why there isn't an even assignment +/- 1
>>>> container. If you are not using the FairScheduler and/or multiple
>>>> assign, then you should look at locality settings, which can cause
>>>> containers to be preferentially run on a subset of nodes, resulting in
>>>> an uneven container assignment per node.
>>>>
>>>> On Wed, Jan 9, 2019 at 2:19 PM Or Raz <[email protected]> wrote:
>>>>
>>>>> As far as I know, the scheduler in YARN only schedules the jobs and
>>>>> not the containers inside each job. Therefore, I don't believe it is
>>>>> relevant.
>>>>> Also, I haven't used or set those two parameters, and I haven't
>>>>> picked or set any particular scheduler for my research (Fair, FIFO or
>>>>> Capacity). Please correct me if I am wrong.
>>>>> P.S. Currently I have no interest in the situation where I run a few
>>>>> jobs concurrently; my case is much simpler, with one job whose
>>>>> allocation of containers I would like to be more balanced...
>>>>>
>>>>> Or
>>>>>
>>>>> On Wed, Jan 9, 2019 at 19:11 Aaron Eng <[email protected]> wrote:
>>>>>
>>>>>> Have you checked the yarn.scheduler.fair.assignmultiple and
>>>>>> yarn.scheduler.fair.max.assign parameters in the ResourceManager
>>>>>> configuration?
>>>>>>
>>>>>> On Wed, Jan 9, 2019 at 9:49 AM Or Raz <[email protected]> wrote:
>>>>>>
>>>>>>> How can I change/suggest a different allocation of containers to
>>>>>>> tasks in Hadoop? This concerns a native Hadoop (2.9.1) cluster on
>>>>>>> AWS.
>>>>>>>
>>>>>>> I am running a native Hadoop cluster (2.9.1) on AWS (with EC2, not
>>>>>>> EMR), and I want the scheduling/allocation of the containers
>>>>>>> (Mappers/Reducers) to be more balanced than it currently is. It
>>>>>>> seems like the RM is assigning the Mappers in a bin packing way
>>>>>>> (where the data resides), while for the Reducers it looks more
>>>>>>> balanced.
>>>>>>> My setup includes three machines with replication factor three
>>>>>>> (all the data is on every machine), and I run my jobs with
>>>>>>> mapreduce.job.reduce.slowstart.completedmaps=0 to start the
>>>>>>> shuffle as fast as possible (it is vital for me that all the
>>>>>>> containers work concurrently; it is a must condition). Also,
>>>>>>> according to the EC2 instances I have chosen and my YARN cluster
>>>>>>> settings, I can run at most 93 containers (31 on each machine).
>>>>>>>
>>>>>>> For example, if I want nine Reducers, then 93 - 9 - 1 = 83
>>>>>>> containers are left for the Mappers, and one is for the AM. I
>>>>>>> have played with the input split size
>>>>>>> (mapreduce.input.fileinputformat.split.minsize,
>>>>>>> mapreduce.input.fileinputformat.split.maxsize) to find the right
>>>>>>> balance where all of the machines have the same "work" for the
>>>>>>> map phase. But it seems like the first 31 Mappers are allocated
>>>>>>> on one machine, the next 31 on the second one and the last 31 on
>>>>>>> the last machine. Thus, I can try to use 87 Mappers, with 31 of
>>>>>>> them on Machine #1, another 31 on Machine #2 and the remaining 25
>>>>>>> on Machine #3; the rest is left for the Reducers, and as Machine
>>>>>>> #1 and Machine #2 are fully occupied, the Reducers have to be
>>>>>>> placed on Machine #3. This way I get an almost balanced
>>>>>>> allocation of Mappers at the expense of an unbalanced Reducer
>>>>>>> allocation. And this is not what I want...
>>>>>>>
>>>>>>> # of mappers = input_size / split_size [Bytes]
>>>>>>>
>>>>>>> split_size = max(mapreduce.input.fileinputformat.split.minsize,
>>>>>>>     min(mapreduce.input.fileinputformat.split.maxsize,
>>>>>>>     dfs.blocksize))
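The two formulas quoted above can be sanity-checked with a short script. The 128 MB block size and the 10 GB input below are illustrative values, not taken from the thread; the min/max defaults are likewise only approximations of Hadoop's effective defaults.

```python
# Sketch of the split-size and mapper-count formulas from the question.

def split_size(min_size, max_size, block_size):
    # split size = max(minsize, min(maxsize, dfs.blocksize))
    return max(min_size, min(max_size, block_size))

def num_mappers(input_size, split):
    # number of mappers ~= input size / split size, rounded up
    return -(-input_size // split)  # ceiling division

MB = 1024 * 1024
block = 128 * MB              # dfs.blocksize (illustrative)
input_bytes = 10 * 1024 * MB  # 10 GB of input (illustrative)

# With a tiny minsize and an effectively unlimited maxsize, the split size
# collapses to the block size.
s = split_size(1, 2**63 - 1, block)
print(s // MB, num_mappers(input_bytes, s))   # 128 80

# Raising minsize above the block size yields fewer, larger splits.
s = split_size(256 * MB, 2**63 - 1, block)
print(s // MB, num_mappers(input_bytes, s))   # 256 40
```

This is why tuning minsize/maxsize only changes how many Mappers there are, not where the RM places them; the placement imbalance has to be addressed on the scheduler side.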
