You may want to look at your other resources. If the memory allocated to running jobs adds up so that not enough is left on a node for another job to run, it won't matter that GPUs are still available there.

The same applies to any other resource (CPUs, cores, etc.).
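
To make that concrete, here is a minimal Python sketch of the idea, not Slurm's actual scheduling code. The `Node` class, the `blocking_resources` helper, and all the numbers are hypothetical and only illustrate that a job fits a node when every requested resource is free, not just the GPUs.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of a node: resource name -> amount still free."""
    name: str
    free: dict


def blocking_resources(node, request):
    """Return the requested resources that the node cannot satisfy."""
    return [res for res, need in request.items() if node.free.get(res, 0) < need]


# Assumed numbers: node B has 6 idle GPUs but almost no unallocated memory.
node_b = Node("B", {"gpu": 6, "cpu": 12, "mem_mb": 1024})
job = {"gpu": 1, "cpu": 2, "mem_mb": 16000}

# An empty list would mean the job fits; here memory is what blocks it.
print(blocking_resources(node_b, job))
```

On a real cluster, the figures to compare come from `scontrol show node` for each node.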

Brian Andrus


On 8/10/2021 8:07 AM, Jack Chen wrote:
Does anyone have any ideas on this?

On Fri, Aug 6, 2021 at 2:52 PM Jack Chen <scs...@gmail.com> wrote:

    I'm using Slurm 15.08.11. When I submit several 1-GPU jobs, Slurm
    doesn't allocate nodes using a compact strategy. Does anyone know how
    to solve this? Will upgrading to the latest Slurm version help?

    For example, there are two nodes, A and B, with 8 GPUs per node. I
    submitted eight 1-GPU jobs, and Slurm allocated the first 6 jobs on
    node A and the last 2 jobs on node B. When I then submit one job that
    needs 8 GPUs, it stays pending because of GPU fragmentation: node A
    has 2 idle GPUs and node B has 6 idle GPUs.

    Thanks in advance!
