Hello,

We have set up "configless slurm" by passing a "conf-server" argument to
slurmd on all nodes. More details here:
https://slurm.schedmd.com/configless_slurm.html

One of the nodes is not able to pick up the configuration:


>srun -w slurm-bm-70 --pty bash
srun: error: fwd_tree_thread: can't find address for host slurm-bm-70, check slurm.conf
srun: error: Task launch for 402011.0 failed on node slurm-bm-70: Can't find an address, check slurm.conf
srun: error: Application launch failed: Can't find an address, check slurm.conf
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: error: Timed out waiting for job step to complete

The problem is limited to this one node. Does anyone know how to fix it? I
have already tried restarting the slurmd service on this node.
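
Since the error says srun cannot find an address for the host, one thing
that could be checked from the submit host is whether the node name resolves
at all through the system resolver. The following is only a minimal sketch
of such a check: the helper name resolve_host is made up for illustration,
and it tests ordinary hostname lookup only, not what Slurm itself reads from
slurm.conf (for example a NodeAddr entry).

```python
import socket


def resolve_host(name):
    """Return the sorted unique addresses the system resolver gives for
    `name`, or an empty list if the lookup fails.

    Hypothetical helper for illustration; it is not part of Slurm.
    """
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        # Name could not be resolved (no DNS record, no /etc/hosts entry, ...)
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    node = "slurm-bm-70"  # node name taken from the srun error above
    addrs = resolve_host(node)
    if addrs:
        print(f"{node} resolves to: {', '.join(addrs)}")
    else:
        print(f"{node} does not resolve on this host")
```

Running it on both the submit host and a working compute node would show
whether the two see the same result for slurm-bm-70.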

Thanks,
Durai
