Can you give us the output of

# scontrol show job 6982

Could be an issue with requesting too many CPUs or something…

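To illustrate the kind of mismatch I have in mind: with SelectTypeParameters=CR_CPU_Memory (from the quoted slurm.conf), both CPUs and memory are consumable, so a request has to fit in both on a node. The following is only a rough sketch of that rule, not Slurm's actual selection code; the `Node` class, the `fits` helper, and all the numbers are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical per-node counters, loosely modelled on what
    # `scontrol show node` reports (totals and current allocations).
    cpus_total: int
    cpus_alloc: int
    mem_total_mb: int
    mem_alloc_mb: int


def fits(node: Node, req_cpus: int, req_mem_mb: int) -> bool:
    """Return True only if BOTH free CPUs and free memory cover the request."""
    free_cpus = node.cpus_total - node.cpus_alloc
    free_mem_mb = node.mem_total_mb - node.mem_alloc_mb
    return req_cpus <= free_cpus and req_mem_mb <= free_mem_mb


# Assumed example values: 12 CPUs are idle, but memory is fully allocated.
node = Node(cpus_total=16, cpus_alloc=4, mem_total_mb=64000, mem_alloc_mb=64000)

# A one-CPU job still cannot be placed, because no memory is left.
print(fits(node, req_cpus=1, req_mem_mb=1000))  # False
```

If the nodes really look like this, the pending jobs would stay queued regardless of idle CPUs, which is why the per-job request in the scontrol output matters.
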
Merlin
--
Merlin Hartley
Computer Officer
MRC Mitochondrial Biology Unit
Cambridge, CB2 0XY
United Kingdom

> On 29 Nov 2017, at 15:21, Christian Anthon <ant...@rth.dk> wrote:
>
> Hi,
>
> I have a problem with a newly set up slurm-17.02.7-1.el6.x86_64: jobs
> seem to be stuck in ReqNodeNotAvail:
>
> 6982  panic  Morgens  ferro  PD  0:00  1  (ReqNodeNotAvail, UnavailableNodes:)
> 6981  panic  SPEC     ferro  PD  0:00  1  (ReqNodeNotAvail, UnavailableNodes:)
>
> The nodes are fully allocated in terms of memory, but not all CPU resources
> are consumed.
>
> PARTITION  AVAIL  TIMELIMIT  NODES  STATE  NODELIST
> _default   up     infinite      19  mix    clone[05-11,25-29,31-32,36-37,39-40,45]
> _default   up     infinite      11  alloc  alone[02-08,10-13]
> fastlane   up     infinite      19  mix    clone[05-11,25-29,31-32,36-37,39-40,45]
> fastlane   up     infinite      11  alloc  alone[02-08,10-13]
> panic      up     infinite      19  mix    clone[05-11,25-29,31-32,36-37,39-40,45]
> panic      up     infinite      12  alloc  alone[02-08,10-13,15]
> free*      up     infinite      19  mix    clone[05-11,25-29,31-32,36-37,39-40,45]
> free*      up     infinite      11  alloc  alone[02-08,10-13]
>
> Possibly relevant lines in slurm.conf (full slurm.conf attached):
>
> SchedulerType=sched/backfill
> SelectType=select/cons_res
> SelectTypeParameters=CR_CPU_Memory
> TaskPlugin=task/none
> FastSchedule=1
>
> Any advice?
>
> Cheers, Christian.
>
> <slurm.conf>