Re: [slurm-users] Pending with resource problems

Henkel, Andreas Wed, 17 Apr 2019 08:03:23 -0700

I think there isn’t enough memory.
AllocTRES shows mem=55G, and your job wants another 40G,
although the node only has about 63G in total (CfgTRES mem=64261M).
Best,
Andreas
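
The arithmetic behind this reply can be sketched as follows. This is only an illustration: the numbers are copied from the scontrol output quoted below, the variable names are made up, and it assumes that Slurm counts the "GB" suffix in --mem as 1024 MB per unit.

```shell
#!/bin/bash
# Values copied from the `scontrol show node compute-0-0` output in the
# quoted message; the variable names are illustrative only.
real_mem_mb=64261            # RealMemory (CfgTRES mem=64261M)
alloc_mem_mb=56320           # AllocMem (AllocTRES mem=55G, i.e. 55 * 1024 MB)
request_mb=$((40 * 1024))    # --mem=40GB, assuming 1 GB = 1024 MB here

# Memory the scheduler can still hand out on this node.
unallocated_mb=$((real_mem_mb - alloc_mem_mb))

echo "unallocated: ${unallocated_mb} MB, requested: ${request_mb} MB"
if [ "$request_mb" -gt "$unallocated_mb" ]; then
    verdict="pending"
    echo "request does not fit until running jobs release memory"
else
    verdict="fits"
fi
```

With these figures the node has 7941 MB left to allocate, far less than the 40960 MB the job asks for.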


On 17.04.2019 at 16:45, Mahmood Naderan <mahmood...@gmail.com> wrote:

Hi,
Although it worked fine in previous runs, the following script is now stuck
in the PD state with the reason (Resources).

$ cat slurm_script.sh
#!/bin/bash
#SBATCH --output=test.out
#SBATCH --job-name=g09-test
#SBATCH --ntasks=20
#SBATCH --nodelist=compute-0-0
#SBATCH --mem=40GB
#SBATCH --account=z7
#SBATCH --partition=EMERALD
g09 test.gjf
$ sbatch slurm_script.sh
Submitted batch job 878
$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               878   EMERALD g09-test shakerza PD       0:00      1 (Resources)



However, everything looks fine to me.

$ sacctmgr list association format=user,account,partition,grptres%20 | grep shaker
shakerzad+      local
shakerzad+         z7    emerald       cpu=20,mem=40G
$ scontrol show node compute-0-0
NodeName=compute-0-0 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=9 CPUTot=32 CPULoad=8.89
   AvailableFeatures=rack-0,32CPUs
   ActiveFeatures=rack-0,32CPUs
   Gres=(null)
   NodeAddr=10.1.1.254 NodeHostName=compute-0-0 Version=18.08
   OS=Linux 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 20 20:32:50 UTC 2017
   RealMemory=64261 AllocMem=56320 FreeMem=37715 Sockets=32 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=444124 Weight=20511900 Owner=N/A MCS_label=N/A
   Partitions=CLUSTER,WHEEL,EMERALD,QUARTZ
   BootTime=2019-04-06T10:03:47 SlurmdStartTime=2019-04-06T10:05:54
   CfgTRES=cpu=32,mem=64261M,billing=47
   AllocTRES=cpu=9,mem=55G
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
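
One thing worth noting in the output above is that FreeMem and AllocMem measure different things: FreeMem is what the operating system reports as free, while AllocMem is what Slurm has already promised to running jobs. The sketch below pulls both out of the RealMemory line. It assumes memory is scheduled as a consumable resource on this cluster, and get_field is a hypothetical helper, not a Slurm command.

```shell
#!/bin/bash
# Sample line copied from the scontrol output above. On a live cluster the
# text would come from `scontrol show node compute-0-0` instead.
line='   RealMemory=64261 AllocMem=56320 FreeMem=37715 Sockets=32 Boards=1'

# get_field KEY TEXT: print the numeric value of KEY=value found in TEXT.
# (Hypothetical helper written for this example.)
get_field() {
    printf '%s\n' "$2" | grep -o "$1=[0-9]*" | cut -d= -f2
}

real=$(get_field RealMemory "$line")
alloc=$(get_field AllocMem "$line")
free=$(get_field FreeMem "$line")
unallocated=$((real - alloc))

echo "OS-reported free memory:   ${free} MB"
echo "not yet allocated by Slurm: ${unallocated} MB"
```

A large FreeMem value only means that running jobs are not currently using everything they reserved; under the assumption above, a new job still has to fit into the unallocated figure.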


Any ideas?

Regards,
Mahmood
