On Wednesday, 21 March 2018, at 12:05:49 (+0100), Alexis Huxley wrote:

> > >Depending on the load on the scheduler, this can be slow. Is there
> > >faster way? Perhaps one that doesn't involve communicating with
> > >the scheduler node? Thanks!
> >
> Thanks for the suggestion Ole, but we have something in place that
> we don't want to change at this time. We just need a faster way
> for a node to get its own status.

As you can see from
https://github.com/mej/nhc/blob/master/helpers/node-mark-offline#L55
starting at line #61, NHC uses "sinfo -o '%t %E' -hn $HOSTNAME" to get
the current node's status. I've confirmed with Moe that this is the
Right Way(tm) to do this with SLURM and that any resulting hangs,
loops, or deadlocks would be considered bugs by SchedMD/Moe and fixed
accordingly. :-)

I have not spoken to him specifically about querying scontrol for
information -- NHC only uses scontrol to alter node state -- but I
would imagine the same would apply there. It's all done via RPC to
slurmctld anyway!

Michael

-- 
Michael E. Jennings <m...@lanl.gov>
HPC Systems Team, Los Alamos National Laboratory
Bldg. 03-2327, Rm. 2341     W: +1 (505) 606-0605
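
For illustration, here is a rough sketch of how the sinfo call quoted above could be wrapped in a script. It is not NHC's actual code: the function names parse_node_status and get_own_status are made up for this example, and it assumes the Slurm node name matches either $HOSTNAME or the short hostname, that the state is the first space-separated field of the output, and that only the first output line matters.

```shell
#!/bin/sh
# Hypothetical helpers (not taken from NHC) around: sinfo -o '%t %E' -hn <node>

# Split one line of "<state> <reason>" output into its two parts.
# The state is the first word; everything after the first space is the reason.
parse_node_status() {
    line=$1
    state=${line%% *}
    case $line in
        *" "*) reason=${line#* } ;;
        *)     reason= ;;
    esac
    printf 'state=%s reason=%s\n' "$state" "$reason"
}

# Ask slurmctld (through sinfo) for one node's state and reason.
# Assumption: the node name is $1, else $HOSTNAME, else the short hostname.
get_own_status() {
    node=${1:-${HOSTNAME:-$(hostname -s)}}
    out=$(sinfo -o '%t %E' -hn "$node") || return 1
    # Keep only the first line in case sinfo prints more than one.
    first=$(printf '%s\n' "$out" | head -n 1)
    parse_node_status "$first"
}
```

Calling get_own_status with no arguments on a compute node would print a single "state=... reason=..." line. Since this still goes through an RPC to slurmctld, it is no faster than running sinfo directly; it only makes the output easier to use in a script.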