We noticed that the Slurm controller (slurmctld) removes nodes that it cannot reach. How can this behavior be disabled? We would like such nodes to be marked down/drain instead of being removed from the sinfo output.
/var/log/slurm/slurmctld.log
[2022-10-25T13:10:01.500] debug: Log file re-opened
[2022-10-25T13:10:01.589] error: get_addr_info: getaddrinfo() failed: Temporary failure in name resolution
[2022-10-25T13:10:01.589] error: slurm_set_addr: Unable to resolve "spg-ethx-f4ce"
[2022-10-25T13:10:01.589] error: slurm_get_port: Address family '0' not supported
[2022-10-25T13:10:01.589] error: _set_slurmd_addr: failure on spg-ethx-f4ce

cat /etc/slurm/slurm.conf | grep -i f4ce
NodeName=spg-ethx-f4ce ...
PartitionName=debug spg-ethx-f4ce ...

No output in sinfo:
sinfo -N | grep f4ce
sinfo -R | grep f4ce

slurmd -V
slurm 21.08.0
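In case it helps anyone narrow this down, here is a rough Python sketch for listing which node names slurmctld reported as unresolvable, based only on the "Unable to resolve" message format in the log above. The function names `unresolved_nodes` and `resolves` are made up for illustration and are not part of Slurm. It assumes the message format stays the same as in our 21.08.0 log, which may not hold for other versions.

```python
import re
import socket

# Matches slurmctld lines of the form seen in the log above, e.g.
#   error: slurm_set_addr: Unable to resolve "spg-ethx-f4ce"
# Assumption: the message format is the same as in the 21.08.0 log shown.
UNRESOLVED = re.compile(r'slurm_set_addr: Unable to resolve "([^"]+)"')


def unresolved_nodes(log_text):
    """Return node names slurmctld could not resolve, in first-seen order."""
    seen = []
    for name in UNRESOLVED.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def resolves(hostname):
    """Return True if this host can currently resolve `hostname`."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


# Small demo using two lines copied from the log in this post.
sample = (
    '[2022-10-25T13:10:01.589] error: slurm_set_addr: '
    'Unable to resolve "spg-ethx-f4ce"\n'
    '[2022-10-25T13:10:01.589] error: _set_slurmd_addr: '
    'failure on spg-ethx-f4ce\n'
)
print(unresolved_nodes(sample))  # prints ['spg-ethx-f4ce']
```

To use it on a real log, read the contents of /var/log/slurm/slurmctld.log into a string, pass it to `unresolved_nodes`, and call `resolves` on each returned name from the controller host to see whether name resolution has recovered since the error was logged.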