Thank you very much for your comments. Oddly enough, I came up with the 
3-partition model as well once I'd sent my email. So, your comments helped to 
confirm that I was thinking on the right lines.

Best regards,
David

________________________________
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Thomas 
M. Payerle <paye...@umd.edu>
Sent: 06 October 2020 18:50
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Controlling access to idle nodes

We use a scavenger partition, and although we do not have the policy you 
describe, it could be used in your case.

Assume you have 6 nodes (node-[0-5]) and two groups A and B.
Create partitions:
partA = node-[0-2]
partB = node-[3-5]
all = node-[0-5]

Create QoSes normal and scavenger.
Allow the normal QoS to preempt jobs running with the scavenger QoS.

In sacctmgr, give members of group A access to partA with the normal QoS, and 
members of group B access to partB with the normal QoS.
Allow both A and B to use partition all with the scavenger QoS.
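
A rough sketch of what this could look like, assuming QoS-based preemption 
(PreemptType=preempt/qos) and that associations and QoS are enforced. The 
account names groupA/groupB and the user names alice/bob are placeholders, 
node definitions are omitted, and PreemptMode=REQUEUE is only one possible 
choice:

```
# slurm.conf (fragment)
PreemptType=preempt/qos
PreemptMode=REQUEUE
AccountingStorageEnforce=associations,limits,qos
PartitionName=partA Nodes=node-[0-2]
PartitionName=partB Nodes=node-[3-5]
PartitionName=all   Nodes=node-[0-5]

# sacctmgr commands (run as a Slurm administrator)
sacctmgr add qos scavenger
sacctmgr modify qos normal set Preempt=scavenger
sacctmgr add account groupA
sacctmgr add account groupB
# one association per (user, partition) pair, each with its own QoS
sacctmgr add user alice Account=groupA Partition=partA QOS=normal DefaultQOS=normal
sacctmgr add user alice Account=groupA Partition=all QOS=scavenger DefaultQOS=scavenger
sacctmgr add user bob Account=groupB Partition=partB QOS=normal DefaultQOS=normal
sacctmgr add user bob Account=groupB Partition=all QOS=scavenger DefaultQOS=scavenger
```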

So members of A can launch jobs on partA with normal QoS (probably want to make 
that their default), and similarly members of B can launch jobs on partB with 
normal QoS.
But members of A can also launch jobs on the partB nodes (through partition 
all) with scavenger QoS, and vice versa.  If the partB nodes used by A are 
needed by B, A's scavenger jobs will get preempted.
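
With the setup described above, a member of group A might submit along these 
lines (job.sh is a placeholder script; --partition and --qos are standard 
sbatch options):

```
# runs on group A's own nodes with the normal QoS
sbatch --partition=partA --qos=normal job.sh
# may run on any idle node, including group B's, and can be preempted
sbatch --partition=all --qos=scavenger job.sh
```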

This is not automatic (users need to explicitly say they want to run jobs on 
the other half of the cluster), but that is probably reasonable because there 
are some jobs one does not wish to get preempted, even if avoiding that means 
waiting a while in the queue.

On Tue, Oct 6, 2020 at 11:12 AM David Baker <d.j.ba...@soton.ac.uk> wrote:
Hello,

I would appreciate your advice on how to deal with this situation in Slurm, 
please. I have a set of nodes used by 2 groups, and normally each group would 
have access to half the nodes. So, I could limit each group to 3 nodes, for 
example. I am trying to devise a scheme that allows each group to make the 
best use of the nodes at all times. In other words, each group could 
potentially use all the nodes (assuming they are all free and the other group 
isn't using the nodes at all).

I cannot set hard and soft limits in Slurm, and so I'm not sure how to make the 
situation flexible. Ideally, each group would be able to use its own 
allocation and then take advantage of any idle nodes via a scavenging 
mechanism. The other group could then pre-empt the scavenger jobs and reclaim 
its nodes. I'm struggling with this since it seems like a two-way scavenger 
situation.

Could anyone please help? I have, by the way, set up partition-based 
pre-emption in the cluster. This allows the general public to scavenge nodes 
owned by research groups.

Best regards,
David




--
Tom Payerle
DIT-ACIGS/Mid-Atlantic Crossroads        paye...@umd.edu
5825 University Research Park               (301) 405-6135
University of Maryland
College Park, MD 20740-3831
