Thanks, I believe the user must have resubmitted the job, hence the updated id.
Cheers, Christian

JobId=6986 JobName=Morgens
   UserId=ferro(2166) GroupId=ferro(22166) MCS_label=N/A
   Priority=1031 Nice=0 Account=rth QOS=normal
   JobState=PENDING Reason=ReqNodeNotAvail,_UnavailableNodes: Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=1-00:00:00 TimeMin=N/A
   SubmitTime=2017-11-29T21:02:38 EligibleTime=2017-11-29T21:02:38
   StartTime=Unknown EndTime=Unknown Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=panic AllocNode:Sid=rnai01:5765
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1 NumCPUs=16 NumTasks=1 CPUs/Task=16 ReqB:S:C:T=0:0:*:*
   TRES=cpu=16,mem=32000,node=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*
   MinCPUsNode=16 MinMemoryCPU=2000M MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   Gres=(null) Reservation=(null)
   OverSubscribe=NO Contiguous=0 Licenses=(null) Network=(null)

> Can you give us the output of
> # scontrol show job 6982
>
> Could be an issue with requesting too many CPUs or something…
>
>
> Merlin
> --
> Merlin Hartley
> Computer Officer
> MRC Mitochondrial Biology Unit
> Cambridge, CB2 0XY
> United Kingdom
>
>> On 29 Nov 2017, at 15:21, Christian Anthon <ant...@rth.dk> wrote:
>>
>> Hi,
>>
>> I have a problem with a newly set up slurm-17.02.7-1.el6.x86_64: jobs
>> seem to be stuck in ReqNodeNotAvail:
>>
>>   6982  panic  Morgens  ferro  PD  0:00  1  (ReqNodeNotAvail, UnavailableNodes:)
>>   6981  panic  SPEC     ferro  PD  0:00  1  (ReqNodeNotAvail, UnavailableNodes:)
>>
>> The nodes are fully allocated in terms of memory, but not all CPU
>> resources are consumed:
>>
>> PARTITION  AVAIL  TIMELIMIT  NODES  STATE  NODELIST
>> _default   up     infinite   19     mix    clone[05-11,25-29,31-32,36-37,39-40,45]
>> _default   up     infinite   11     alloc  alone[02-08,10-13]
>> fastlane   up     infinite   19     mix    clone[05-11,25-29,31-32,36-37,39-40,45]
>> fastlane   up     infinite   11     alloc  alone[02-08,10-13]
>> panic      up     infinite   19     mix    clone[05-11,25-29,31-32,36-37,39-40,45]
>> panic      up     infinite   12     alloc  alone[02-08,10-13,15]
>> free*      up     infinite   19     mix    clone[05-11,25-29,31-32,36-37,39-40,45]
>> free*      up     infinite   11     alloc  alone[02-08,10-13]
>>
>> Possibly relevant lines in slurm.conf (full slurm.conf attached):
>>
>> SchedulerType=sched/backfill
>> SelectType=select/cons_res
>> SelectTypeParameters=CR_CPU_Memory
>> TaskPlugin=task/none
>> FastSchedule=1
>>
>> Any advice?
>>
>> Cheers, Christian.
>>
>> <slurm.conf>
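
For what it's worth, one quick way to confirm that memory rather than CPUs is what is blocking the job (a minimal sketch, not from the thread itself; the format strings are standard sinfo/squeue options, and the partition name and job id are taken from the output above):

  # Per node: CPUs as allocated/idle/other/total, configured memory, free memory
  sinfo -N -p panic -o "%N %C %m %e"

  # The pending job's id, memory request, CPU count, and pending reason
  squeue -j 6986 -o "%i %m %C %R"

Job 6986 asks for NumCPUs=16 at MinMemoryCPU=2000M, i.e. 16 x 2000M = 32000M on a single node (matching TRES mem=32000 above). With SelectType=select/cons_res and SelectTypeParameters=CR_CPU_Memory, memory is a consumable resource alongside CPUs, so a node with idle CPUs but less than 32000M of free memory cannot start the job. That would be consistent with the "mix" nodes having spare CPUs while being fully allocated on memory.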