Hello,
I'm relatively new to administering Slurm, so my apologies if I've missed something obvious.
We have nodes with 4 GPUs and nodes with 8 GPUs. I would like users to be able to request the total number of GPUs they require. The MPI software is not fussed how many nodes it spans.
I had hoped requests such as these would work:
#SBATCH --gres=gpu:8
#SBATCH --exclusive
#SBATCH --nodes=1-2
However, as both "gres" and the alternative workaround "mem" are per-node resources rather than per-job ones, this doesn't work: the request asks for 8 GPUs on each node, so a pair of 4-GPU boxes can never be chosen.
So, is there a way to do this right, or to fake it? Such jobs should run on whatever appropriate hardware configuration is first available. The submitted job script will then slightly reconfigure our software configuration depending on the hardware type it lands on, before launching via srun.
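
To make that last step concrete, here is a minimal sketch of the kind of check the job script would do. The layout names, the binary and its flag are hypothetical placeholders, and I am assuming SLURM_JOB_NUM_NODES and SLURM_GPUS_ON_NODE are set inside the allocation (the latter, as far as I know, only when GPUs are allocated as a gres).

```shell
#!/bin/bash
# Sketch only: layout names are hypothetical placeholders for our
# software's configurations, not anything Slurm itself defines.

# Map (node count, GPUs per node) to a layout name.
pick_layout() {
    local nodes="$1" gpus_per_node="$2"
    case "${nodes}x${gpus_per_node}" in
        1x8) echo "single-8gpu" ;;   # one 8-GPU box
        2x4) echo "dual-4gpu" ;;     # a pair of 4-GPU boxes
        *)   echo "unsupported" ;;   # anything else is not a layout we handle
    esac
}

# Defaults of 0 keep the sketch runnable outside a Slurm allocation.
layout="$(pick_layout "${SLURM_JOB_NUM_NODES:-0}" "${SLURM_GPUS_ON_NODE:-0}")"
echo "selected layout: ${layout}"

# The real script would then adjust its configuration and launch, e.g.:
#   srun ./our_mpi_app --layout "${layout}"   # hypothetical binary and flag
```

The point is only that the script can adapt after the fact; what I am missing is a way to let the scheduler pick either shape in the first place.
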
As an alternative, I note the "heterogeneous jobs" feature. This allows jobs which require resources of "hardware config A" AND "hardware config B". Is there any way to request one hardware configuration OR another?
I can almost fake it for a single use-case with "constraints"; however, this syntax doesn't seem to be understood by the parser code:
--constraints=[grp1|grp2|grp3|grp4]&[gpuA*1&gpuB*1]
--nodes=1-2
With example node configuration:
NodeName=small1 Gres=gpu:4 Feature=gpuA,grp1
NodeName=small2 Gres=gpu:4 Feature=gpuB,grp1
NodeName=small3 Gres=gpu:4 Feature=gpuB,grp2
NodeName=small4 Gres=gpu:4 Feature=gpuB,grp2
NodeName=big1 Gres=gpu:8 Feature=gpuA,gpuB,grp3
NodeName=big2 Gres=gpu:8 Feature=gpuA,gpuB,grp4
All ideas are appreciated.
Thanks,
Rob Middleton.