That 'not responding' is the issue, and it usually means one of two things:
1) slurmd is not running on the node
2) something on the network is blocking communication between the
node and the master (firewall, SELinux, congestion, bad NIC, routes, etc.)
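
As a rough way to tell the two cases apart, here is a minimal sketch that tries a TCP connection to slurmd from the controller. It assumes the node uses Slurm's default SlurmdPort of 6818 (change it if your slurm.conf sets something else), and the helper name `port_open` is only illustrative.

```python
import socket

# Assumption: default SlurmdPort; change this if slurm.conf overrides it.
SLURMD_PORT = 6818


def port_open(host, port=SLURMD_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, unreachable hosts, and DNS failures.
        return False
```

Running `port_open("node01")` on the controller (with `node01` standing in for the real node name) and getting False points at either slurmd being down or something in between dropping the traffic. A True result only shows that the port is reachable, not that slurmd is healthy.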
Brian Andrus
On 7/30/2021 3:51 PM, Soichi Ha
Brian,
Thank you for your reply and thanks for setting the email title. I forgot
to edit it before I sent it!
I am not sure how I can reply to your reply, but I hope this makes it
to the right place.
I've updated slurm.conf to increase the controller debug level
> SlurmctldDebug=5
I now s