Hi Marcus,
The following jobs are running or pending after I killed job 100816,
which was running on computelab-134's T4:
100815 RUNNING computelab-134 gpu:gv100:1 None 1
100817 PENDING gpu:gv100:1 Resources 1
100818 PENDING gpu:tu104:1 Resources 1
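(That list was produced with essentially the same squeue command as in my
original message below; reconstructing the field list from memory, it was
something like:
$ squeue --noheader -u rradmer --Format=jobid,state,nodelist,gres,reason,priority | sed 's/ */ /g' | sort
so the exact fields may be slightly off; the trailing 1 is presumably the
job priority.)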
$ scontrol -d show node computelab-134
NodeName=computelab-134 Arch=x86_64 CoresPerSocket=6
CPUAlloc=6 CPUErr=0 CPUTot=12 CPULoad=0.00
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=gpu:gv100:1,gpu:tu104:1
GresDrain=N/A
GresUsed=gpu:gv100:1(IDX:0),gpu:tu104:0(IDX:N/A)
NodeAddr=computelab-134 NodeHostName=computelab-134 Version=17.11
OS=Linux 4.4.0-143-generic #169-Ubuntu SMP Thu Feb 7 07:56:38 UTC 2019
RealMemory=64307 AllocMem=32148 FreeMem=61126 Sockets=2 Boards=1
State=MIXED ThreadsPerCore=1 TmpDisk=404938 Weight=1 Owner=N/A
MCS_label=N/A
Partitions=test-backfill
BootTime=2019-03-29T12:09:25 SlurmdStartTime=2019-04-01T11:34:35
CfgTRES=cpu=12,mem=64307M,billing=12,gres/gpu=2,gres/gpu:gv100=1,gres/gpu:tu104=1
AllocTRES=cpu=6,mem=32148M,gres/gpu=1,gres/gpu:gv100=1
CapWatts=n/a
CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
$ scontrol -d show job 100815
JobId=100815 JobName=bash
UserId=rradmer(27578) GroupId=hardware(30) MCS_label=N/A
Priority=1 Nice=0 Account=cag QOS=normal
JobState=RUNNING Reason=None Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
DerivedExitCode=0:0
RunTime=00:06:45 TimeLimit=02:00:00 TimeMin=N/A
SubmitTime=2019-04-02T05:13:05 EligibleTime=2019-04-02T05:13:05
StartTime=2019-04-02T05:13:05 EndTime=2019-04-02T07:13:05 Deadline=N/A
PreemptTime=None SuspendTime=None SecsPreSuspend=0
LastSchedEval=2019-04-02T05:13:05
Partition=test-backfill AllocNode:Sid=computelab-frontend-02:7873
ReqNodeList=computelab-134 ExcNodeList=(null)
NodeList=computelab-134
BatchHost=computelab-134
NumNodes=1 NumCPUs=6 NumTasks=1 CPUs/Task=6 ReqB:S:C:T=0:0:*:*
TRES=cpu=6,mem=32148M,node=1,billing=6,gres/gpu=1,gres/gpu:gv100=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
Nodes=computelab-134 CPU_IDs=0-5 Mem=32148 GRES_IDX=gpu:gv100(IDX:0)
MinCPUsNode=6 MinMemoryNode=32148M MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
Gres=gpu:gv100:1 Reservation=(null)
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=/bin/bash
WorkDir=/home/rradmer
Power=
$ scontrol -d show job 100817
JobId=100817 JobName=bash
UserId=rradmer(27578) GroupId=hardware(30) MCS_label=N/A
Priority=1 Nice=0 Account=cag QOS=normal
JobState=PENDING Reason=Resources Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
DerivedExitCode=0:0
RunTime=00:00:00 TimeLimit=02:00:00 TimeMin=N/A
SubmitTime=2019-04-02T05:13:11 EligibleTime=2019-04-02T05:13:11
StartTime=2019-04-02T07:13:05 EndTime=2019-04-02T09:13:05 Deadline=N/A
PreemptTime=None SuspendTime=None SecsPreSuspend=0
LastSchedEval=2019-04-02T05:20:44
Partition=test-backfill AllocNode:Sid=computelab-frontend-03:21736
ReqNodeList=computelab-134 ExcNodeList=(null)
NodeList=(null) SchedNodeList=computelab-134
NumNodes=1-1 NumCPUs=6 NumTasks=1 CPUs/Task=6 ReqB:S:C:T=0:0:*:*
TRES=cpu=6,mem=32148M,node=1,gres/gpu=1,gres/gpu:gv100=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=6 MinMemoryNode=32148M MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
Gres=gpu:gv100:1 Reservation=(null)
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=/bin/bash
WorkDir=/home/rradmer
Power=
$ scontrol -d show job 100818
JobId=100818 JobName=bash
UserId=rradmer(27578) GroupId=hardware(30) MCS_label=N/A
Priority=1 Nice=0 Account=cag QOS=normal
JobState=PENDING Reason=Resources Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
DerivedExitCode=0:0
RunTime=00:00:00 TimeLimit=02:00:00 TimeMin=N/A
SubmitTime=2019-04-02T05:13:12 EligibleTime=2019-04-02T05:13:12
StartTime=2019-04-02T09:13:00 EndTime=2019-04-02T11:13:00 Deadline=N/A
PreemptTime=None SuspendTime=None SecsPreSuspend=0
LastSchedEval=2019-04-02T05:21:32
Partition=test-backfill AllocNode:Sid=computelab-frontend-02:12826
ReqNodeList=computelab-134 ExcNodeList=(null)
NodeList=(null) SchedNodeList=computelab-134
NumNodes=1-1 NumCPUs=6 NumTasks=1 CPUs/Task=6 ReqB:S:C:T=0:0:*:*
TRES=cpu=6,mem=32148M,node=1,gres/gpu=1,gres/gpu:tu104=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=6 MinMemoryNode=32148M MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
Gres=gpu:tu104:1 Reservation=(null)
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=/bin/bash
WorkDir=/home/rradmer
Power=
On Mon, Apr 1, 2019 at 11:24 PM Marcus Wagner
<wag...@itc.rwth-aachen.de> wrote:
Dear Randall,
could you please also provide the output of:
scontrol -d show node computelab-134
scontrol -d show job 100091
scontrol -d show job 100094
Best
Marcus
On 4/1/19 4:31 PM, Randall Radmer wrote:
I can’t get backfill to work for a machine with two GPUs (one is
a P4 and the other a T4).
Submitting jobs works as expected: if the GPU I request is free,
then my job runs, otherwise it goes into a pending state. But if
I have pending jobs for one GPU ahead of pending jobs for the
other GPU, I see blocking issues.
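For reference, the jobs are simple interactive shells that each request
one specific GPU type. Reconstructing the options from the scontrol job
records (so treat the details as approximate), a submission looks roughly
like:
$ srun -p test-backfill -w computelab-134 --gres=gpu:tu104:1 --cpus-per-task=6 --mem=32148M --time=02:00:00 --pty /bin/bash
with --gres=gpu:gv100:1 used for the other GPU.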
More specifically, I can create a case where I am running a job
on each of the GPUs and have a pending job waiting for the P4
followed by a pending job waiting for a T4. I would expect that
if I exit the running T4 job, then backfill would start the
pending T4 job, even though it has to jump ahead of the pending P4
job. This does not happen...
The following shows my jobs after I exited from a running T4 job,
which had ID 100092:
$ squeue --noheader -u rradmer
--Format=jobid,state,gres,nodelist,reason | sed 's/ */ /g' | sort
100091 RUNNING gpu:gv100:1 computelab-134 None
100093 PENDING gpu:gv100:1 Resources
100094 PENDING gpu:tu104:1 Resources
I can find no reason why 100094 doesn’t start running (I’ve
waited up to an hour, just to make sure).
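If it would help, I can also send the scheduler's projected start times
for my pending jobs, e.g. from:
$ squeue --start -u rradmer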
System config info and log snippets shown below.
Thanks much,
Randy
Node state corresponding to the squeue command, shown above:
$ scontrol show node computelab-134 | grep -i [gt]res
Gres=gpu:gv100:1,gpu:tu104:1
CfgTRES=cpu=12,mem=64307M,billing=12,gres/gpu=2,gres/gpu:gv100=1,gres/gpu:tu104=1
AllocTRES=cpu=6,mem=32148M,gres/gpu=1,gres/gpu:gv100=1
Slurm config follows:
$ scontrol show conf | grep -Ei '(gres|^Sched|prio|vers)'
AccountingStorageTRES =
cpu,mem,energy,node,billing,gres/gpu,gres/gpu:gp100,gres/gpu:gp104,gres/gpu:gv100,gres/gpu:tu102,gres/gpu:tu104,gres/gpu:tu106
GresTypes = gpu
PriorityParameters = (null)
PriorityDecayHalfLife = 7-00:00:00
PriorityCalcPeriod = 00:05:00
PriorityFavorSmall = No
PriorityFlags =
PriorityMaxAge = 7-00:00:00
PriorityUsageResetPeriod = NONE
PriorityType = priority/multifactor
PriorityWeightAge = 0
PriorityWeightFairShare = 0
PriorityWeightJobSize = 0
PriorityWeightPartition = 0
PriorityWeightQOS = 0
PriorityWeightTRES = (null)
PropagatePrioProcess = 0
SchedulerParameters =
default_queue_depth=2000,bf_continue,bf_ignore_newly_avail_nodes,bf_max_job_test=1000,bf_window=10080,kill_invalid_depend
SchedulerTimeSlice = 30 sec
SchedulerType = sched/backfill
SLURM_VERSION = 17.11.9-2
GPUs on node:
$ nvidia-smi --query-gpu=index,name,gpu_bus_id --format=csv
index, name, pci.bus_id
0, Tesla T4, 00000000:82:00.0
1, Tesla P4, 00000000:83:00.0
The gres file on node:
$ cat /etc/slurm/gres.conf
Name=gpu Type=tu104 File=/dev/nvidia0 Cores=0,1,2,3,4,5
Name=gpu Type=gp104 File=/dev/nvidia1 Cores=6,7,8,9,10,11
Random sample of SlurmSchedLogFile:
$ sudo tail -3 slurm.sched.log
[2019-04-01T08:14:23.727] sched: Running job scheduler
[2019-04-01T08:14:23.728] sched: JobId=100093. State=PENDING.
Reason=Resources. Priority=1. Partition=test-backfill.
[2019-04-01T08:14:23.728] sched: JobId=100094. State=PENDING.
Reason=Resources. Priority=1. Partition=test-backfill.
Random sample of SlurmctldLogFile:
$ sudo grep backfill slurmctld.log | tail -5
[2019-04-01T08:16:53.281] backfill: beginning
[2019-04-01T08:16:53.281] backfill test for JobID=100093 Prio=1
Partition=test-backfill
[2019-04-01T08:16:53.281] backfill test for JobID=100094 Prio=1
Partition=test-backfill
[2019-04-01T08:16:53.281] backfill: reached end of job queue
[2019-04-01T08:16:53.281] backfill: completed testing 2(2) jobs,
usec=707
--
Marcus Wagner, Dipl.-Inf.
IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de