Hi all,
I'm seeing some odd behavior when using the --mem-per-gpu flag instead of
the --mem flag to request memory while also requesting all available CPUs
on a node (in this example, all available nodes have 32 CPUs):
$ srun --ntasks-per-node=8 --cpus-per-task=4 --gpus-per-node=gtx1080ti:1 --mem-
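
The two request styles being contrasted look roughly like this; the memory
amounts and the job command are placeholders, not the values from the
truncated command above:

# per-node memory request: behaves as expected
$ srun --ntasks-per-node=8 --cpus-per-task=4 \
       --gpus-per-node=gtx1080ti:1 --mem=64G ./my_job

# per-GPU memory request: the form that shows the odd behavior
$ srun --ntasks-per-node=8 --cpus-per-task=4 \
       --gpus-per-node=gtx1080ti:1 --mem-per-gpu=8G ./my_job
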
Hello,
I've recently adopted setting AutoDetect=nvml in our GPU nodes' gres.conf
files to automatically populate Cores and Links for GPUs, which has been
working well.
I'm now wondering whether I can prioritize scheduling single-GPU jobs on
NVLink pairs (these are PCIe A6000s) where one of the G
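
For reference, a hand-written gres.conf along the lines of what nvml
autodetection populates might look like the sketch below for a four-GPU node
with two NVLink pairs; the device files, core ranges, and Links values are
illustrative, not taken from a real node:

# gres.conf sketch: each Links entry counts NVLink connections to the other
# GPUs and -1 marks the device itself, so GPUs 0/1 and 2/3 are the NVLink pairs
Name=gpu Type=a6000 File=/dev/nvidia0 Cores=0-15  Links=-1,1,0,0
Name=gpu Type=a6000 File=/dev/nvidia1 Cores=0-15  Links=1,-1,0,0
Name=gpu Type=a6000 File=/dev/nvidia2 Cores=16-31 Links=0,0,-1,1
Name=gpu Type=a6000 File=/dev/nvidia3 Cores=16-31 Links=0,0,1,-1
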
Hello Slurm users,
I'm trying to write a check in our job_submit.lua script that enforces
relative resource requirements such as disallowing more than 4 CPUs or 48GB
of memory per GPU. The QOS itself has a MaxTRESPerJob of
cpu=32,gres/gpu=8,mem=384G (roughly one full node), but we're looking to
pr
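
A minimal sketch of such a check might look like the following, assuming GPUs
are requested with --gpus-per-node (so the count shows up in
job_desc.tres_per_node) and that memory arrives as a per-node figure in
pn_min_memory; per-CPU memory requests (--mem-per-cpu) and the exact field
names and NO_VAL sentinels would need checking against your Slurm version:

-- job_submit.lua sketch: reject jobs asking for more than 4 CPUs or
-- 48 GB of memory per requested GPU.
local MAX_CPUS_PER_GPU   = 4
local MAX_MEM_MB_PER_GPU = 48 * 1024

-- Unset numeric fields arrive as NO_VAL-style sentinels rather than nil.
local function given(v, sentinel, default)
   if v == nil or v == sentinel then return default end
   return v
end

-- Pull the per-node GPU count out of a TRES string such as "gres:gpu:2" or
-- "gres:gpu:a6000:2"; a GPU request with no explicit count means one GPU.
local function gpus_per_node(job_desc)
   local tres = job_desc.tres_per_node
   if tres == nil or not string.find(tres, "gpu") then return 0 end
   return tonumber(string.match(tres, "gpu.*:(%d+)")) or 1
end

function slurm_job_submit(job_desc, part_list, submit_uid)
   local gpus = gpus_per_node(job_desc)
   if gpus == 0 then
      return slurm.SUCCESS  -- no GPUs requested; nothing to enforce here
   end

   local cpus = given(job_desc.cpus_per_task, slurm.NO_VAL16, 1)
              * given(job_desc.ntasks_per_node, slurm.NO_VAL16, 1)
   local mem  = given(job_desc.pn_min_memory, slurm.NO_VAL64, 0)  -- MB per node

   if cpus > gpus * MAX_CPUS_PER_GPU or mem > gpus * MAX_MEM_MB_PER_GPU then
      slurm.log_user("Requests are limited to 4 CPUs and 48G of memory per GPU")
      return slurm.ERROR
   end
   return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
   return slurm.SUCCESS
end
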
Hello,
I have noticed that jobs submitted to non-preemptable partitions
(PreemptType = preempt/partition_prio and PreemptMode = REQUEUE) under
accounts with GrpTRES limits will become pending with AssocGrpGRES as the
reason when the account is up against the relevant limit, even when there
are oth
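
The symptom can be confirmed by comparing the pending jobs' reason codes
against the association limit, e.g. with the commands below (the account name
and output fields are just an example):

$ squeue -A myaccount -t PD -o "%.10i %.9P %.8u %.20r"
$ sacctmgr show assoc where account=myaccount format=Account,User,Partition,GrpTRES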