My cluster has 9 machines with 643 GB of memory and 288 vCores in total. I use the Fair Scheduler.
Before turning preemption on, my fair-scheduler.xml was:

<allocations>
    <queue name="highPriority">
       <minResources>100000 mb, 30 vcores</minResources>
       <maxResources>250000 mb, 100 vcores</maxResources>
    </queue>
    <queue name="default">
       <minResources>50000 mb, 20 vcores</minResources>
       <maxResources>100000 mb, 50 vcores</maxResources>
       <maxAMShare>-1.0f</maxAMShare>
    </queue>
    <queue name="ep">
       <minResources>100000 mb, 30 vcores</minResources>
       <maxResources>300000 mb, 100 vcores</maxResources>
       <maxAMShare>-1.0f</maxAMShare>
    </queue>
    <queue name="vip">
       <minResources>30000 mb, 20 vcores</minResources>
       <maxResources>60000 mb, 50 vcores</maxResources>
       <maxAMShare>-1.0f</maxAMShare>
     </queue>
  <fairSharePreemptionTimeout>300</fairSharePreemptionTimeout>
</allocations>

The yarn-site.xml is:
        <property>
                <name>yarn.scheduler.fair.preemption</name>
                <value>false</value>
        </property>


The sum of the maxResources of all queues equals the total memory of the cluster. Because of each queue's maxResources limit, when one queue is full while the other queues are idle, the full queue still cannot borrow resources from the idle ones. As a result, the total resource usage of my cluster is always low.


So I turned preemption on and changed fair-scheduler.xml to this:


<allocations>
    <queue name="highPriority">
       <minResources>100000 mb, 30 vcores</minResources>
       <maxResources>500000 mb, 100 vcores</maxResources>
       <weight>0.35</weight>
       <minSharePreemptionTimeout>20</minSharePreemptionTimeout>
       <fairSharePreemptionTimeout>25</fairSharePreemptionTimeout>
       <fairSharePreemptionThreshold>0.8</fairSharePreemptionThreshold>
    </queue>
    <queue name="default">
       <minResources>25000 mb, 20 vcores</minResources>
       <maxResources>225000 mb, 70 vcores</maxResources>
       <weight>0.14</weight>
       <minSharePreemptionTimeout>20</minSharePreemptionTimeout>
       <fairSharePreemptionTimeout>25</fairSharePreemptionTimeout>
       <fairSharePreemptionThreshold>0.5</fairSharePreemptionThreshold>
       <maxAMShare>-1.0f</maxAMShare>
    </queue>
    <queue name="ep">
       <minResources>200000 mb, 30 vcores</minResources>
       <maxResources>600000 mb, 100 vcores</maxResources>
       <weight>0.42</weight>
       <minSharePreemptionTimeout>20</minSharePreemptionTimeout>
       <fairSharePreemptionTimeout>25</fairSharePreemptionTimeout>
       <fairSharePreemptionThreshold>0.8</fairSharePreemptionThreshold>
       <maxAMShare>-1.0f</maxAMShare>
    </queue>
    <queue name="vip">
       <minResources>50000 mb, 20 vcores</minResources>
       <maxResources>120000 mb, 30 vcores</maxResources>
       <weight>0.09</weight>
       <minSharePreemptionTimeout>20</minSharePreemptionTimeout>
       <fairSharePreemptionTimeout>25</fairSharePreemptionTimeout>
       <fairSharePreemptionThreshold>0.8</fairSharePreemptionThreshold>
       <maxAMShare>-1.0f</maxAMShare>
     </queue>
</allocations>
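Turning preemption on also means flipping the yarn-site.xml flag shown earlier. For completeness, the enabled configuration presumably looks like this (the utilization-threshold property is shown with its default value, only for illustration):

```xml
<property>
        <name>yarn.scheduler.fair.preemption</name>
        <value>true</value>
</property>
<!-- Preemption only starts once overall cluster utilization crosses
     this threshold (default 0.8); shown here only for illustration. -->
<property>
        <name>yarn.scheduler.fair.preemption.cluster-utilization-threshold</name>
        <value>0.8</value>
</property>
```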
I doubled the maxResources for each queue. I know that even with preemption on, the memory usage of each queue will not exceed its maxResources.
Yes, the cluster resource usage can now reach 90%, which proves that total cluster utilization has improved, but YARN applications always take much more time to finish. I don't know what happened.
Could anyone give me some suggestions?
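One thing I wonder about myself: with minSharePreemptionTimeout at 20 s and fairSharePreemptionTimeout at 25 s, the scheduler can start killing containers about 20 seconds after a queue becomes starved, and any work inside a preempted container is lost and must be rescheduled, which could explain longer job times. A less aggressive sketch for one queue, for comparison (the timeout values here are illustrative only, not tuned recommendations):

```xml
<queue name="highPriority">
   <minResources>100000 mb, 30 vcores</minResources>
   <maxResources>500000 mb, 100 vcores</maxResources>
   <weight>0.35</weight>
   <!-- Illustrative values: wait longer before preempting so running
        containers are not killed and re-run so frequently. -->
   <minSharePreemptionTimeout>120</minSharePreemptionTimeout>
   <fairSharePreemptionTimeout>300</fairSharePreemptionTimeout>
   <fairSharePreemptionThreshold>0.8</fairSharePreemptionThreshold>
</queue>
```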

