On 02/13/2018 08:13 AM, Nadav Toledo wrote:
> Does anyone know of a way to get the number of idle GPUs per node or
> for the whole cluster?
> sinfo -o %G gives the total amount of GRES resources for each node. Is
> there a way to get the idle amount, the same as you can get for CPUs (%C)?
> Perhaps if one uses a lock file like /dev/nvidia# for each GPU, you can
> check their states?

I think printing the GRES usage for nodes is a neat idea. So I've added
a flag "-G" to my pestat command so that the GRES usage for each job on
each node is printed. The squeue command can print GRES usage using -o %b.
Could you give pestat a try to see if it fits your needs:
https://github.com/OleHolmNielsen/Slurm_tools/tree/master/pestat
Just run "pestat -G" on your Slurm cluster.
At the moment pestat doesn't print a column with the total configured
GRES on each node, but this could be added if there is interest.
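
In the meantime, the idle GPU count can be estimated by subtracting the
GRES requested by running jobs from the GRES configured on each node. The
following is only a rough sketch of that idea, not how pestat works. It
assumes input lines of the form "node gres", such as might be produced by
sinfo -N -h -o "%N %G" and squeue -h -t R -o "%N %b", and it assumes each
job line names a single node (multi-node node lists would need expanding
first, e.g. with scontrol show hostnames). The function names
parse_gpu_count and idle_gpus are hypothetical, and the exact GRES string
format varies between Slurm versions.

```python
import re


def parse_gpu_count(gres: str) -> int:
    """Count GPUs in a GRES string such as 'gpu:2', 'gpu:tesla:4(S:0-1)'
    or 'gres:gpu:1'. Non-GPU entries and '(null)' count as 0."""
    # Drop parenthesised parts like '(S:0-1)' or '(null)'.
    gres = re.sub(r"\([^)]*\)", "", gres)
    total = 0
    for entry in gres.split(","):
        # Assumption: some Slurm versions prefix entries with 'gres:' or 'gres/'.
        entry = re.sub(r"^gres[:/]", "", entry.strip())
        fields = entry.split(":")
        if fields[0] != "gpu":
            continue
        # A trailing number is the count; without one, assume a single GPU.
        total += int(fields[-1]) if fields[-1].isdigit() else 1
    return total


def idle_gpus(node_lines, job_lines):
    """Return {node: configured GPUs - GPUs requested by running jobs}.

    Both arguments are iterables of 'node gres' lines (hypothetical format,
    one node per line)."""
    total = {}
    for line in node_lines:
        node, gres = line.split(None, 1)
        total[node] = parse_gpu_count(gres)
    used = {}
    for line in job_lines:
        node, gres = line.split(None, 1)
        used[node] = used.get(node, 0) + parse_gpu_count(gres)
    return {node: total[node] - used.get(node, 0) for node in total}


if __name__ == "__main__":
    # Made-up sample data for illustration, not real cluster output.
    nodes = ["n01 gpu:4", "n02 gpu:tesla:2(S:0-1)", "n03 (null)"]
    jobs = ["n01 gpu:1", "n01 gpu:2", "n02 gres:gpu:2"]
    for node, idle in sorted(idle_gpus(nodes, jobs).items()):
        print(node, idle)
```

Note that this counts GPUs requested by jobs, which is what Slurm
considers allocated. It says nothing about whether the GPUs are actually
busy.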
Please send me feedback and comments about pestat.
/Ole