Hi Rafal!
Is there a way to force yarn to use the thresholds configured above (70% and 
30%) per node?
- Currently we can't specify thresholds per node.

As per your initial mail, YARN memory per node is ~50GB, which means all nodes 
have the same resources. Is there a specific use case for per-node allocation 
based on percentage?


From: Rafał Radecki
Sent: 10 November 2016 14:59
To: Ravi Prakash
Cc: user
Subject: Re: Yarn 2.7.3 - capacity scheduler container allocation to nodes?

Hi Ravi.

I did not specify labels this time ;) I just created two queues, as is 
visible in the configuration.
Overall the queues work, but the allocation of jobs is different than I 
expected, as I wrote at the beginning.

BR,
Rafal.

2016-11-10 2:48 GMT+01:00 Ravi Prakash:
Hi Rafal!
Have you been able to launch the job successfully first without configuring 
node-labels? Do you really need node-labels? How much total memory do you have 
on the cluster? Node labels are usually for specifying special capabilities of 
the nodes (e.g. some nodes could have GPUs and your application could request 
to be run on only the nodes which have GPUs)
HTH
Ravi

On Wed, Nov 9, 2016 at 5:37 AM, Rafał Radecki wrote:
Hi All.

I have a 4 node cluster on which I run yarn. I created 2 queues "long" and 
"short", first with 70% resource allocation, the second with 30% allocation. 
Both queues are configured on all available nodes by default.

My memory for yarn per node is ~50GB. Initially I thought that when I run 
tasks in the "short" queue, yarn would allocate them on all nodes using 30% of 
the memory on every node. So for example if I run 20 tasks, 2GB each (40GB in 
total), in the short queue:
- the first ~7 will be scheduled on node1 (14GB total; 30% of the 50GB 
available on this node for the "short" queue is 15GB)
- next ~7 tasks will be scheduled on node2
- ~6 remaining tasks will be scheduled on node3
- yarn on node4 will not use any resources assigned to "short" queue.
But this seems not to be the case. At the moment I see that all tasks are 
started on node1 and other nodes have no tasks started.
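
The arithmetic behind the expected and the observed placement can be sketched 
as follows. The figures (4 nodes, 50GB each, 30% for "short", 20 tasks of 2GB) 
come from the message; the function names and the first-fit placement are 
assumptions made for illustration and are not how the CapacityScheduler 
actually picks nodes. The sketch only shows that a cluster-wide 30% share 
(60GB) already covers the 40GB requested, and that nothing in the queue 
percentages prevents all 20 tasks from fitting on a single 50GB node.

```python
# Figures from the message; everything else here is a hypothetical illustration.
NODES = 4
NODE_MEMORY_GB = 50
SHORT_SHARE_PERCENT = 30
TASKS = 20
TASK_MEMORY_GB = 2


def cluster_wide_guarantee_gb(nodes, node_memory_gb, share_percent):
    """Memory guaranteed to a queue when its share applies to the whole cluster."""
    return nodes * node_memory_gb * share_percent // 100


def tasks_per_node_if_share_were_per_node(node_memory_gb, share_percent, task_memory_gb):
    """Tasks that would fit on one node if the share were enforced per node."""
    return (node_memory_gb * share_percent // 100) // task_memory_gb


def first_fit(nodes, node_memory_gb, tasks, task_memory_gb):
    """Toy placement: put each task on the first node that still has room."""
    placement = [0] * nodes
    for _ in range(tasks):
        for i in range(nodes):
            if (placement[i] + 1) * task_memory_gb <= node_memory_gb:
                placement[i] += 1
                break
    return placement


if __name__ == "__main__":
    print(cluster_wide_guarantee_gb(NODES, NODE_MEMORY_GB, SHORT_SHARE_PERCENT))
    print(tasks_per_node_if_share_were_per_node(
        NODE_MEMORY_GB, SHORT_SHARE_PERCENT, TASK_MEMORY_GB))
    print(first_fit(NODES, NODE_MEMORY_GB, TASKS, TASK_MEMORY_GB))
```

With these numbers the per-node reading gives 7 tasks per node, while the 
cluster-wide reading leaves the whole 40GB request inside the queue's share, 
so a single node with enough free memory can take every task.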

I attached my yarn-site.xml and capacity-scheduler.xml.

Is there a way to force yarn to use the thresholds configured above (70% and 
30%) per node and not per cluster as a whole? I would like to get a 
configuration in which on every node 30% is always available for the "short" 
queue and 70% for the "long" queue, and in case any resources are free for a 
particular queue they are not used by other queues. Is it possible?

BR,
Rafal.
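
One part of the request above, keeping a queue from using resources left free 
by the other, can be approximated cluster-wide by setting each queue's 
maximum-capacity equal to its capacity. The sketch below generates such 
properties. It assumes both queues sit directly under root, since the attached 
capacity-scheduler.xml is not shown in the thread, and the helper names are 
hypothetical. These limits apply to the cluster as a whole, not to each node.

```python
import xml.etree.ElementTree as ET

# Assumption: "long" and "short" are direct children of the root queue.
PREFIX = "yarn.scheduler.capacity.root"


def queue_properties(shares):
    """Build CapacityScheduler properties that pin each queue to its share."""
    props = {f"{PREFIX}.queues": ",".join(shares)}
    for name, percent in shares.items():
        props[f"{PREFIX}.{name}.capacity"] = str(percent)
        # maximum-capacity equal to capacity stops the queue from growing
        # into resources left idle by other queues (cluster-wide limit).
        props[f"{PREFIX}.{name}.maximum-capacity"] = str(percent)
    return props


def to_configuration_xml(props):
    """Render the properties in the Hadoop <configuration> layout."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_configuration_xml(queue_properties({"long": 70, "short": 30})))
```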

