Re: [slurm-users] Unable to start slurmd service

2021-11-16 Thread Jaep Emmanuel
That was it. Thank you very much.
From: slurm-users on behalf of Stephen Cousins
Reply to: Slurm User Community List
Date: Tuesday, 16 November 2021 at 19:53
To: Slurm User Community List
Subject: Re: [slurm-users] Unable to start slurmd service
scontrol update nodename=name-of-node state

Re: [slurm-users] Unable to start slurmd service

2021-11-16 Thread Christopher Samuel
On 11/16/21 7:07 am, Jaep Emmanuel wrote:
> root@ecpsc10:~# scontrol show node ecpsc10
[...]
>    State=DOWN ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
[...]
>    Reason=Node unexpectedly rebooted [slurm@2021-11-16T14:41:04]
This is why the node isn't considered available, as o
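For picking those two fields out of a longer `scontrol show node` listing, a small filter along these lines may help. This is only a sketch: the function name `node_state_reason` is made up for illustration, and it assumes the output has a whitespace-separated `State=` field and a `Reason=` field that runs to the end of its line, as in the output quoted above.

```shell
# Hypothetical helper (not part of Slurm): read `scontrol show node`
# output on stdin and print the State= field and the Reason= text.
node_state_reason() {
    awk '{
        # State= is one whitespace-separated field on its line.
        for (i = 1; i <= NF; i++) if ($i ~ /^State=/) print $i
        # Reason= contains spaces, so take everything to end of line.
        if (match($0, /Reason=.*/)) print substr($0, RSTART, RLENGTH)
    }'
}

# Intended use on a machine with Slurm installed:
#   scontrol show node ecpsc10 | node_state_reason
```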

Re: [slurm-users] Unable to start slurmd service

2021-11-16 Thread Stephen Cousins
> From: slurm-users on behalf of Stephen Cousins
> Reply to: Slurm User Community List
> Date: Tuesday, 16 November 2021 at 19:09
> To: Slurm User Community List
> Subject: Re: [slurm-users] Unable to start slurmd service
>
> I think you just need to use sco

Re: [slurm-users] Unable to start slurmd service

2021-11-16 Thread Jaep Emmanuel
Community List
Subject: Re: [slurm-users] Unable to start slurmd service
I think you just need to use scontrol to "resume" that node.
On Tue, Nov 16, 2021, 10:10 AM Jaep Emmanuel <emmanuel.j...@epfl.ch> wrote:
Hi, It might be a newbie question since I'm new to slurm. I

Re: [slurm-users] Unable to start slurmd service

2021-11-16 Thread Stephen Cousins
I think you just need to use scontrol to "resume" that node.
On Tue, Nov 16, 2021, 10:10 AM Jaep Emmanuel wrote:
> Hi,
>
> It might be a newbie question since I'm new to slurm.
> I'm trying to restart the slurmd service on one of our Ubuntu boxes.
>
> The slurmd.service is defined by:
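As a rough illustration of the "resume" suggestion, the wrapper below builds the usual `scontrol update NodeName=... State=RESUME` call. The function name `resume_node` and the `DRY_RUN` switch are invented for this sketch; it assumes it is run as root or the Slurm admin user on a host that can reach the controller.

```shell
# Hypothetical wrapper: return a DOWN node to service.
# With DRY_RUN=1 the command is printed instead of executed.
resume_node() {
    node="$1"
    if [ -z "$node" ]; then
        echo "usage: resume_node NODENAME" >&2
        return 2
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "scontrol update NodeName=$node State=RESUME"
    else
        scontrol update NodeName="$node" State=RESUME
    fi
}

# Example, using the node name from this thread:
#   resume_node ecpsc10
```

Printing the command first with `DRY_RUN=1` is a cheap way to double-check the node name before changing its state.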

Re: [slurm-users] Unable to start slurmd service

2021-11-16 Thread Jaep Emmanuel
rFilesystem NONE plugin loaded
check if firewalld is enabled
No
From: slurm-users on behalf of Hadrian Djohari
Reply to: Slurm User Community List
Date: Tuesday, 16 November 2021 at 16:56
To: Slurm User Community List
Subject: Re: [slurm-users] Unable to start slurmd service
There can be few

Re: [slurm-users] Unable to start slurmd service

2021-11-16 Thread Hadrian Djohari
There can be a few possibilities:
1. Check if munge is working properly. From the scheduler master run "munge -n | ssh ecpsc10 unmunge"
2. Check if selinux is enforced
3. Check if firewalld or similar firewall is enabled
4. Check the logs /var/log/slurm/slurmctld.log or slurmd.log on
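The four checks above could be strung together roughly as follows, run from the scheduler master. Treat this as a sketch under assumptions: `run` and `check_node` are made-up helper names, passwordless ssh to the node is assumed, `getenforce` only exists where the SELinux tools are installed (often not the case on Ubuntu), and the slurmd.log path is a guess based on the slurmctld.log directory mentioned above.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, else run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: walk through the four checks for one node.
check_node() {
    node="$1"
    # 1. munge credential round trip from this host to the node
    run sh -c "munge -n | ssh $node unmunge"
    # 2. SELinux mode on the node (assumes getenforce is installed)
    run ssh "$node" getenforce
    # 3. whether firewalld is active on the node
    run ssh "$node" systemctl is-active firewalld
    # 4. tail of the slurmd log (path is an assumption)
    run ssh "$node" tail -n 50 /var/log/slurm/slurmd.log
}

# Example:
#   DRY_RUN=1 check_node ecpsc10   # show what would be run
#   check_node ecpsc10             # actually run the checks
```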