I found a year-old thread on this topic that, at the time, seemed to offer no 
hope.  I'm wondering whether the situation has changed.  My testing so far 
isn't encouraging.

The thread (https://groups.google.com/g/slurm-users/c/yhnSVBoohik) discusses 
giving lower priority jobs some amount of guaranteed run time.  That's what 
we're trying to do.

Our global settings are PreemptMode=SUSPEND,GANG and 
PreemptType=preempt/partition_prio.  We have a high priority partition that 
nothing should ever preempt, and an open partition that is always preemptable.  
In between is a burst partition.  It can be preempted if the high priority 
partition needs the resources.  That's the partition we'd like to guarantee a 1 
hour run time on.  Looking at the sacctmgr man page, it gives this info on QOS:

PreemptExemptTime
              Specifies a minimum run time for jobs of this QOS before they
              are considered for preemption. This QOS option takes precedence
              over the global PreemptExemptTime. This is only honored for
              PreemptMode=REQUEUE and PreemptMode=CANCEL.

This sounds like exactly what we want.  So on the burst QOS that is 
available on the burst partition, I set a PreemptExemptTime of 30 seconds 
and a PreemptMode of cancel, and tested.  Whenever a higher priority job 
came along, my job was cancelled immediately; the exempt time was not 
honored.
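
For reference, a rough sketch of the setup described above.  The partition 
names, PriorityTier values, and the QOS name "burst" are placeholders for 
illustration, not copied from our actual config, and Nodes= and other 
partition options are omitted:

    # slurm.conf, global preemption settings (as stated above)
    PreemptType=preempt/partition_prio
    PreemptMode=SUSPEND,GANG

    # slurm.conf, partitions (names and tiers are placeholders)
    PartitionName=high  PriorityTier=3 PreemptMode=OFF
    PartitionName=burst PriorityTier=2
    PartitionName=open  PriorityTier=1

    # QOS change used for the test (QOS name assumed to be "burst")
    sacctmgr modify qos burst set PreemptExemptTime=00:00:30 PreemptMode=cancel

The intent is that a job in the burst partition gets the exempt time before 
a job in the high priority partition can preempt it.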

Am I misunderstanding how this is supposed to work, or am I asking for an 
impossible Slurm configuration?

Thanks,

Rob


