This solution is even better.
I actually use pestat for my own needs as an admin.
But I originally asked the question in order to extend slurm_exporter, which is client-side code for Prometheus/Grafana that exports Slurm statistics so they can be displayed as graphs.
For that purpose, squeue -o %b is enough.
But I am sure there is a need for pestat to print the GRES info as well; you are already helping at least Yair and me.

Thanks, Nadav
On 13/02/2018 17:41, Ole Holm Nielsen wrote:
On 02/13/2018 08:13 AM, Nadav Toledo wrote:
> Does anyone know of a way to get the number of idle GPUs per node, or for the whole cluster?
>
> sinfo -o %G gives the total amount of GRES resources for each node. Is there a way to get the idle amount, the same way you can for CPUs (%C)?
> Perhaps if one uses a lock file like /dev/nvidia# for each GPU, one could check their states?

I think printing the GRES usage for nodes is a neat idea.  So I've added a flag "-G" to my pestat command so that the GRES usage for each job on each node is printed.  The squeue command can print GRES usage using -o %b.
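
As a rough sketch of how the squeue -o %b output could be turned into per-node GPU usage (for example inside an exporter), the Python below sums the GPUs requested by running jobs on each node. It is not taken from pestat or slurm_exporter. The function name gpus_in_use is hypothetical, and the sketch assumes input lines of the form "<node> <gres>", such as might come from squeue -h -t RUNNING -o "%N %b", with GRES strings like gpu, gpu:2, gpu:tesla:1 or gres:gpu:2. The exact %b format differs between Slurm versions, and multi-node jobs (where %N is a compressed host list) are not handled here.

```python
from collections import defaultdict


def gpus_in_use(squeue_output):
    """Sum GPUs requested by running jobs, per node.

    Assumes each line looks like "<node> <gres>", e.g. "node01 gpu:2".
    Lines without a gpu entry (such as "(null)") are ignored.
    """
    used = defaultdict(int)
    for line in squeue_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        node, gres = parts
        for item in gres.split(","):
            fields = item.split(":")
            # Assumption: newer versions may prefix entries with "gres:" or "gres/".
            if fields[0] == "gres":
                fields = fields[1:]
            if not fields:
                continue
            name = fields[0]
            if name.startswith("gres/"):
                name = name[len("gres/"):]
            if name != "gpu":
                continue
            # A trailing number is the count; a bare "gpu" or "gpu:type" means one.
            count = int(fields[-1]) if fields[-1].isdigit() else 1
            used[node] += count
    return dict(used)


if __name__ == "__main__":
    sample = "node01 gpu:2\nnode01 gpu:tesla:1\nnode02 (null)\nnode03 gpu\n"
    print(gpus_in_use(sample))
```

Subtracting these counts from the per-node totals reported by sinfo -o %G would then give an estimate of the idle GPUs on each node.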

Could you give pestat a try to see if it fits your needs:
https://github.com/OleHolmNielsen/Slurm_tools/tree/master/pestat

Just run "pestat -G" on your Slurm cluster.

At the moment pestat doesn't print a column of total configured GRES in the node, but this could be added if there is interest.

Please send me feedback and comments about pestat.

/Ole

