Hi,

I'm looking for a way for an administrator to boost any job so that it is next to run when resources become available. What is the best-practice way to do this? Happy to try something new :-D

The way I thought to do this was to have a QOS with a large priority and manually assign it to the job. Job 469 is the job in this example that I am trying to elevate to be next in the queue:

    scontrol update jobid=469 qos=boost

sprio shows that this job has the highest priority by quite some way; however, job number 492 will be the next to run.

squeue (excluding running jobs):

    JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
      469 Backgroun sleeping   centos PD  0:00     1 (Resources)
      492  Priority sleepy.s superuse PD  0:00     1 (Resources)
      448 Backgroun sleepy.s groupboo PD  0:00     1 (Resources)
      478 Backgroun sleepy.s groupboo PD  0:00     1 (Resources)
      479 Backgroun sleepy.s groupboo PD  0:00     1 (Resources)
      480 Backgroun sleepy.s groupboo PD  0:00     1 (Resources)
      481 Backgroun sleepy.s groupboo PD  0:00     1 (Resources)
      482 Backgroun sleepy.s groupboo PD  0:00     1 (Resources)
      483 Backgroun sleepy.s groupboo PD  0:00     1 (Resources)
      484 Backgroun sleepy.s groupboo PD  0:00     1 (Resources)
      449 Backgroun sleepy.s superuse PD  0:00     1 (Resources)
      450 Backgroun sleepy.s superuse PD  0:00     1 (Resources)
      465 Backgroun sleeping   centos PD  0:00     1 (Resources)
      466 Backgroun sleeping   centos PD  0:00     1 (Resources)
      467 Backgroun sleeping   centos PD  0:00     1 (Resources)

[root@master yp]# sprio

    JOBID PARTITION  PRIORITY  AGE FAIRSHARE JOBSIZE PARTITION       QOS
      448 Backgroun     13667   58       484    3125     10000         0
      449 Backgroun     13205   58        23    3125     10000         0
      450 Backgroun     13205   58        23    3125     10000         0
      465 Backgroun     13157   32         0    3125     10000         0
      466 Backgroun     13157   32         0    3125     10000         0
      467 Backgroun     13157   32         0    3125     10000         0
      469 Backgroun  10013157   32         0    3125     10000  10000000
      478 Backgroun     13640   32       484    3125     10000         0
      479 Backgroun     13640   32       484    3125     10000         0
      480 Backgroun     13640   32       484    3125     10000         0
      481 Backgroun     13610   32       454    3125     10000         0
      482 Backgroun     13610   32       454    3125     10000         0
      483 Backgroun     13610   32       454    3125     10000         0
      484 Backgroun     13610   32       454    3125     10000         0
      492  Priority   1003158   11        23    3125   1000000         0

I'm trying to troubleshoot why the highest-priority job is not next to run; jobs in the partition called "Priority" seem to run first.

Job 469 is not subject to any QOS, partition, user-account or group limits on the number of CPUs, jobs, nodes, etc. I've set this test cluster up from scratch to be sure!

[root@master yp]# scontrol show job 469

    JobId=469 JobName=sleeping.sh
       UserId=centos(1000) GroupId=centos(1000) MCS_label=N/A
       Priority=10013161 Nice=0 Account=default QOS=boost
       JobState=PENDING Reason=Resources Dependency=(null)
       Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
       RunTime=00:00:00 TimeLimit=UNLIMITED TimeMin=N/A
       SubmitTime=2019-03-11T16:01:20 EligibleTime=2019-03-11T16:01:20
       StartTime=2020-03-10T15:23:40 EndTime=Unknown Deadline=N/A
       PreemptTime=None SuspendTime=None SecsPreSuspend=0
       LastSchedEval=2019-03-11T16:54:44
       Partition=Background AllocNode:Sid=master:1322
       ReqNodeList=(null) ExcNodeList=(null)
       NodeList=(null)
       NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
       TRES=cpu=1,node=1
       Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
       MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
       Features=(null) DelayBoot=00:00:00
       Gres=(null) Reservation=(null)
       OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
       Command=/home/centos/sleeping.sh
       WorkDir=/home/centos
       StdErr=/home/centos/sleeping.sh.e469
       StdIn=/dev/null
       StdOut=/home/centos/sleeping.sh.o469
       Power=

The partition called "Priority" has a priority boost assigned through a QOS:

    PartitionName=Priority Nodes=compute[01-02] Default=NO MaxTime=INFINITE State=UP Priority=1000 QOS=Priority
    PartitionName=Background Nodes=compute[01-02] Default=YES MaxTime=INFINITE State=UP Priority=10

Any ideas would be much appreciated.

Sean

--
Sean Brisbane | Linux Systems Specialist
Securelinx Ltd., Pottery Road, Dun Laoghaire, Co. Dublin.
Registered in Ireland No. 357396
www.securelinx.com <http://www.securelinx.com/> - Linux Leaders in Ireland
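
As a sanity check on the sprio figures quoted in this message, here is a minimal Python sketch. It is not part of the cluster setup: the names SPRIO_ROWS, factor_sum and rank_by_priority are purely illustrative, and the numbers are copied by hand from three rows of the sprio listing. It sums the factor columns for each job and ranks the jobs by the reported PRIORITY value. The sums agree with the PRIORITY column to within 1, presumably because sprio rounds each column for display, and on job priority alone 469 ranks ahead of 492.

```python
# Rows copied by hand from the sprio output in this message:
# job id -> (reported PRIORITY, (AGE, FAIRSHARE, JOBSIZE, PARTITION, QOS))
SPRIO_ROWS = {
    469: (10013157, (32, 0, 3125, 10000, 10000000)),
    492: (1003158, (11, 23, 3125, 1000000, 0)),
    448: (13667, (58, 484, 3125, 10000, 0)),
}


def factor_sum(factors):
    """Add up the per-factor contributions shown by sprio for one job."""
    return sum(factors)


def rank_by_priority(rows):
    """Return job ids ordered from highest to lowest reported priority."""
    return sorted(rows, key=lambda job_id: rows[job_id][0], reverse=True)


if __name__ == "__main__":
    for job_id, (reported, factors) in SPRIO_ROWS.items():
        # Show the reported total next to the sum of the displayed factors.
        print(job_id, reported, factor_sum(factors))
    print("order by job priority:", rank_by_priority(SPRIO_ROWS))
```

This only confirms that the boost QOS is being counted in the job priority of 469; it does not model how the scheduler chooses between partitions.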