Hi Herbert,

just like Angelos described, we also have logic in our poweroff script that 
checks whether the node is really IDLE and only sends the poweroff command if 
that is the case.

Excerpt:
# Expand the hostlist that Slurm passes as $1 into individual host names.
hosts=$(scontrol show hostnames "$1")
for host in $hosts; do
        # Only power off nodes that are idle and already flagged for power saving.
        if ! scontrol show node "$host" | tr ' ' '\n' | grep -q 'State=IDLE+POWER$'; then
                echo "node $host NOT IDLE" >>"$OUTFILE"
                continue
        fi
        echo "node $host IDLE" >>"$OUTFILE"
        ssh "$host" poweroff
        ...
        sleep 1
        ...
done
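
For completeness, this is roughly how such a script gets hooked into the 
power-saving setup in slurm.conf; the paths and the node list are only 
placeholders for illustration, not our actual values:

SuspendProgram=/usr/local/sbin/slurm_poweroff.sh   # called with the hostlist as $1
ResumeProgram=/usr/local/sbin/slurm_poweron.sh
SuspendTime=3600                                   # idle seconds before suspend
SuspendExcNodes=node[01-04]                        # optional: never suspend these nodes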

Best,
Florian

________________________________
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of 
Steininger, Herbert <herbert_steinin...@psych.mpg.de>
Sent: Monday, 24 August 2020 10:52
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: [External] [slurm-users] [slurm 20.02.3] don't suspend nodes in down 
state

Hi,

how can I prevent Slurm from suspending nodes that I have set to the DOWN 
state for maintenance?
I know about "SuspendExcNodes", but rolling out slurm.conf every time this 
changes doesn't seem like the right approach.
Is there a state I can set so that the nodes don't get suspended?

It has happened a few times that I was working on a server and, after our 
idle time (1 h), Slurm decided to suspend the node.

TIA,
Herbert

--
Herbert Steininger
Leiter EDV & HPC
Administrator
Max-Planck-Institut für Psychiatrie
Kraepelinstr.  2-10
80804 München
Tel      +49 (0)89 / 30622-368
Mail   herbert_steinin...@psych.mpg.de
Web  https://www.psych.mpg.de
