Hi Jason,
We use the Slurm tool "pestat" (Processor Element status) available from
[1] for all kinds of cluster monitoring, including GPU usage. An example
usage is:
$ pestat -G -p a100
where:
  -G       GPU GRES (Generic Resource) is printed after each JobID
  -p a100  Print only nodes in partition a100
(pestat then prints a per-node status table; its first column is Hostname.)
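
Because pestat's output is plain text, it combines easily with standard shell
tools. As a rough sketch (not from the original post; the GRES string "gpu"
and the partition name "a100" are just the values from the example above and
will depend on your site's gres.conf and partition layout):

$ pestat -G -p a100 | grep -i gpu      # keep only lines that mention GPU GRES
$ watch -n 60 'pestat -G -p a100'      # refresh the per-node overview every minute
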
Hello all,
Apologies for the basic question, but is there a straightforward,
best-accepted method for using Slurm to report on which GPUs are currently
in use? I've done some searching and people recommend all sorts of methods,
including parsing the output of nvidia-smi (seems inefficient, especially ...