Did Diego's suggestion from [1] not help narrow things down?

[1] https://lists.schedmd.com/pipermail/slurm-users/2021-August/007708.html

From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Jack Chen <scs...@gmail.com>
Date: Tuesday, August 10, 2021 at 10:08 AM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Compact scheduling strategy for small GPU jobs

Does anyone have any ideas on this?

On Fri, Aug 6, 2021 at 2:52 PM Jack Chen <scs...@gmail.com> wrote:

I'm using Slurm 15.08.11. When I submit several 1-GPU jobs, Slurm doesn't allocate nodes using a compact strategy. Does anyone know how to solve this? Will upgrading to the latest Slurm version help?

For example, there are two nodes, A and B, with 8 GPUs per node. I submitted eight 1-GPU jobs; Slurm allocated the first 6 jobs to node A and the last 2 jobs to node B. When I then submit one job requesting 8 GPUs, it stays pending because of GPU fragmentation: node A has 2 idle GPUs and node B has 6 idle GPUs.

Thanks in advance!
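
The fragmentation in the example above can be checked with a little arithmetic. The following is only a toy sketch in Python and does not model Slurm's scheduler. The names `idle_after`, `node_that_fits`, and `pack_compactly` are made up for illustration. It assumes the 8 GPUs per node and the 6/2 split described above, and compares that split with a hypothetical best-fit ("compact") packing.

```python
from typing import Dict, List, Optional

GPUS_PER_NODE = 8  # from the example: nodes A and B have 8 GPUs each


def idle_after(used: Dict[str, int]) -> Dict[str, int]:
    """Return the number of idle GPUs per node given GPUs in use."""
    return {node: GPUS_PER_NODE - n for node, n in used.items()}


def node_that_fits(idle: Dict[str, int], gpus_needed: int) -> Optional[str]:
    """Return the first node with enough idle GPUs, or None if no node fits."""
    for node, free in idle.items():
        if free >= gpus_needed:
            return node
    return None


def pack_compactly(num_jobs: int, nodes: List[str]) -> Dict[str, int]:
    """Hypothetical best-fit placement of 1-GPU jobs (not Slurm's algorithm).

    Each job goes to the node with the fewest idle GPUs that still has room.
    """
    used = {node: 0 for node in nodes}
    for _ in range(num_jobs):
        candidates = [n for n in nodes if used[n] < GPUS_PER_NODE]
        target = min(candidates, key=lambda n: GPUS_PER_NODE - used[n])
        used[target] += 1
    return used


# Placement reported in the message: 6 jobs on node A, 2 jobs on node B.
observed_idle = idle_after({"A": 6, "B": 2})
# Hypothetical compact placement of the same eight 1-GPU jobs.
compact_idle = idle_after(pack_compactly(8, ["A", "B"]))

print("observed idle:", observed_idle, "-> 8-GPU job fits on:",
      node_that_fits(observed_idle, 8))
print("compact idle: ", compact_idle, "-> 8-GPU job fits on:",
      node_that_fits(compact_idle, 8))
```

In both cases 8 GPUs are idle in total, but only the compact placement leaves them all on one node, which is what a single-node 8-GPU job needs.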