Dear Slurm User list,

Using https://slurm.schedmd.com/power_save.html, we had one case out of
many (>242) node starts that resulted in

|slurm_update error: Invalid node state specified|

when we called:

|scontrol update NodeName="$1" state=RESUME reason=FailedStartup|

in the fail script. We run this to make 100% sure that the instances,
which are created on demand, are `~idle` again after being removed by
the fail program. They are set to RESUME before the actual instance gets
destroyed. I remember running into this error before when issuing the
command manually, but I don't remember under which conditions it occurs.
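
To narrow down when this happens, here is a minimal sketch of how the call
could be wrapped so that the node's state gets logged whenever the update is
rejected. The function name `resume_with_diagnostics` and the `LOGFILE` path
are hypothetical and not part of Slurm; only `scontrol update` and
`scontrol show node` are actual Slurm commands.

```shell
#!/bin/bash
# Sketch of a fail-script helper: try to set the node to RESUME and, if
# scontrol rejects the update, log the state the node was in at that moment.
# LOGFILE and the function name are hypothetical, not part of Slurm.
LOGFILE="${LOGFILE:-/tmp/resume_fail_debug.log}"

resume_with_diagnostics() {
    local node="$1" err state
    # Capture stderr of the update so the exact error message gets logged.
    if err=$(scontrol update NodeName="$node" state=RESUME reason=FailedStartup 2>&1); then
        return 0
    fi
    # "scontrol show node" prints a "State=..." field; extract it for the log.
    state=$(scontrol show node "$node" 2>/dev/null | grep -o 'State=[^ ]*' | head -n 1)
    echo "$(date '+%F %T') node=$node ${state:-State=unknown} error: $err" >> "$LOGFILE"
    return 1
}

# Only run when called with a node name, as the fail script would be.
if [ "$#" -gt 0 ]; then
    resume_with_diagnostics "$1"
fi
```

With something like this in place, the log should show which node state the
rejected RESUME was issued against the next time the error appears.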

Maybe someone has an idea of how to tackle this problem.

Best regards
Xaver Stiensmeier
