Odds are the backfill loop is not penetrating far enough into the
queue. Recall that Slurm has two scheduling loops. The primary loop is
the faster one, but it only works down the queue as far as it can
schedule: it stops at the first job it cannot start. In this case the
primary loop would stop immediately on the GPU jobs it can't schedule,
leaving the lower-priority CPU-only jobs behind them to the backfill
scheduler.
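
If that is what's happening, it may help to loosen the backfill
scheduler's limits in slurm.conf. A minimal sketch, assuming roughly
default settings; the values below are illustrative starting points,
not recommendations:

    # Illustrative values only -- tune for your workload.
    # bf_continue:          let backfill resume where it left off after
    #                       releasing locks, instead of restarting
    # bf_max_job_test:      number of jobs backfill considers per cycle
    # bf_window:            how far ahead (in minutes) backfill plans
    # default_queue_depth:  how many jobs the primary loop examines
    SchedulerParameters=bf_continue,bf_max_job_test=1000,bf_window=2880,default_queue_depth=500

After editing, "scontrol reconfigure" applies the change, and "sdiag"
will show how deep the scheduling and backfill cycles actually get.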
Hello!
We are having an issue with high-priority GPU jobs blocking
low-priority CPU-only jobs.
Our cluster is set up with one partition, "all", in which all nodes
reside. In this "all" partition we have four generations of compute nodes,
including GPU nodes. We do this to make use of those GPU nodes' CPU
cores for CPU-only jobs as well.
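
For reference, a hypothetical slurm.conf excerpt of the layout
described above; the node names, counts, and hardware specs are
invented purely for illustration:

    # Invented node definitions standing in for the four generations
    GresTypes=gpu
    NodeName=gen1-[01-20]     CPUs=16 RealMemory=64000
    NodeName=gen4-gpu-[01-04] CPUs=64 RealMemory=512000 Gres=gpu:4
    # A single partition spanning every node
    PartitionName=all Nodes=ALL Default=YES State=UP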