Hi Matt,
How about this sinfo command:
$ sinfo -O NodeList:30,Features:30,StateLong
NODELIST                      AVAIL_FEATURES                STATE
i023                          xeon2650v2,infiniband,xeon16  draining@
i[004-022,024-050]            xeon2650v2,infiniband,xeon16  allocated
x[001-192]                    xeon2650v4,opa,xeon24         allocated
a[001-128]                    xeon6242r,opa,xeon40          allocated
b[001-012],c[001-196]         xeon6148v5,opa,xeon40         allocated
You can grep for the desired features and use the nodelist in column 1 for
further processing.
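As a rough sketch of that further processing, a small shell function can pick out the lines whose feature list contains a given feature and join their node lists into one string suitable for squeue -w. The function name nodes_with_feature is made up for this example, and it assumes input in the two- or three-column layout shown above (node list first, comma-separated features second):

```shell
# Hypothetical helper: read sinfo-style lines on stdin (column 1 = node list,
# column 2 = comma-separated features) and print one comma-joined node list
# covering every line whose features include the requested one.
nodes_with_feature() {
    awk -v want="$1" '
        {
            # Split the feature column and look for an exact match, so that
            # e.g. "xeon" does not accidentally match "xeon40".
            n = split($2, feats, ",")
            for (i = 1; i <= n; i++) {
                if (feats[i] == want) {
                    list = (list == "" ? $1 : list "," $1)
                    break
                }
            }
        }
        END { if (list != "") print list }
    '
}
```

Something along the lines of squeue -a -t r -w "$(sinfo -h -O NodeList:30,Features:30 | nodes_with_feature opa)" should then list the running jobs on the opa nodes. I have not tested that exact pipeline, and the :30 field widths may need to be increased if your node lists or feature lists are longer than 30 characters.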
/Ole
On 10/14/21 2:44 PM, Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND
APPLICATIONS INC] wrote:
All,
I work on a cluster that uses SLURM which has various types of nodes that
are controlled via --constraint flags in sbatch.
Now, I started thinking "How can I figure out how many jobs are
running/pending/etc on a certain type of node?". I first thought obviously
"squeue --constraint=foo", but...nope. No --constraint flag with squeue.
Okay. Constraints are just Features by another name, but...you can't seem
to just squeue a feature either.
I asked a SLURM guru here and they suggested using --nodelist/-w a la:
squeue -a -w nodea[001-100],nodeb[001-100],... -t r
where you pass in all the nodes of a certain type. And, yep, that works!
But that also means I have to know what nodes are what type. I could
obviously do a one-time parsing of "scontrol show nodes" and see what each
chunk is and be done with it...but dangit I'm lazy and SLURM has so many
programs and options there might just be something and I haven't read the
right manpage! :)
So I was wondering if anyone out there knows of a cool/elegant/efficient
way of doing this?
Thanks,
Matt
PS: I still might write a bash script that lists the node names for
each constraint, and accept that I might have to update it once every
year or two. Now time to look at what parser SLURM uses for nodelist. Can
you use regexes and use *, etc? Or just use nodea[001-100]? Time to find out!