Can you give us the output of
# scontrol show job 6982

Could be an issue with requesting too many CPUs or something…
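
With SelectTypeParameters=CR_CPU_Memory both CPUs and memory are treated as consumable resources, so it may also be worth comparing what the job requests against what a node still has free. A rough sketch, assuming a POSIX shell; pick_fields is a hypothetical helper name (not a Slurm tool), and the key names are the ones I would expect in scontrol output, so adjust them if your version prints something different:

```shell
#!/bin/sh
# pick_fields (hypothetical helper): reads `scontrol show job` or
# `scontrol show node` output on stdin and prints only the
# Key=Value fields related to CPU and memory requests and usage.
pick_fields() {
    # Put every whitespace-separated Key=Value token on its own line,
    # then keep only the resource-related keys.
    tr -s ' \t' '\n\n' |
        grep -E '^(JobState|Reason|NumCPUs|MinMemoryNode|MinMemoryCPU|CPUAlloc|CPUTot|RealMemory|AllocMem)='
}

# Usage on a host with the Slurm client tools installed
# (clone05 is one of the "mix" nodes from the listing in this thread):
#   scontrol show job 6982 | pick_fields
#   scontrol show node clone05 | pick_fields
```

If the job's memory request is larger than RealMemory minus AllocMem on every node, it would stay pending even though CPUs are free.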


Merlin
--
Merlin Hartley
Computer Officer
MRC Mitochondrial Biology Unit
Cambridge, CB2 0XY
United Kingdom

> On 29 Nov 2017, at 15:21, Christian Anthon <ant...@rth.dk> wrote:
> 
> Hi,
> 
> I have a problem with a newly set up slurm-17.02.7-1.el6.x86_64 where jobs
> seem to be stuck in ReqNodeNotAvail:
> 
>               6982     panic  Morgens    ferro PD       0:00 1 (ReqNodeNotAvail, UnavailableNodes:)
>               6981     panic     SPEC    ferro PD       0:00 1 (ReqNodeNotAvail, UnavailableNodes:)
> 
> The nodes are fully allocated in terms of memory, but not all CPU resources
> are consumed.
> 
> PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
> _default     up   infinite     19    mix clone[05-11,25-29,31-32,36-37,39-40,45]
> _default     up   infinite     11  alloc alone[02-08,10-13]
> fastlane     up   infinite     19    mix clone[05-11,25-29,31-32,36-37,39-40,45]
> fastlane     up   infinite     11  alloc alone[02-08,10-13]
> panic        up   infinite     19    mix clone[05-11,25-29,31-32,36-37,39-40,45]
> panic        up   infinite     12  alloc alone[02-08,10-13,15]
> free*        up   infinite     19    mix clone[05-11,25-29,31-32,36-37,39-40,45]
> free*        up   infinite     11  alloc alone[02-08,10-13]
> 
> Possibly relevant lines in slurm.conf (full slurm.conf attached)
> 
> SchedulerType=sched/backfill
> SelectType=select/cons_res
> SelectTypeParameters=CR_CPU_Memory
> TaskPlugin=task/none
> FastSchedule=1
> 
> Any advice?
> 
> Cheers, Christian.
> 
> <slurm.conf>
