That was it. Thank you very much.
From: slurm-users on behalf of Stephen Cousins
Reply to: Slurm User Community List
Date: Tuesday, 16 November 2021 at 19:53
To: Slurm User Community List
Subject: Re: [slurm-users] Unable to start slurmd service
scontrol update nodename=name-of-node state
On 11/16/21 7:07 am, Jaep Emmanuel wrote:
> root@ecpsc10:~# scontrol show node ecpsc10
[...]
>State=DOWN ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
[...]
Reason=Node unexpectedly rebooted [slurm@2021-11-16T14:41:04]
This is why the node isn't considered available, as o
*From: *slurm-users on behalf of
> Stephen Cousins
> *Reply to: *Slurm User Community List
> *Date: *Tuesday, 16 November 2021 at 19:09
> *To: *Slurm User Community List
> *Subject: *Re: [slurm-users] Unable to start slurmd service
>
>
>
> I think you just need to use sco
Community List
Subject: Re: [slurm-users] Unable to start slurmd service
I think you just need to use scontrol to "resume" that node.
On Tue, Nov 16, 2021, 10:10 AM Jaep Emmanuel <emmanuel.j...@epfl.ch> wrote:
Hi,
It might be a newbie question since I'm new to slurm.
I
I think you just need to use scontrol to "resume" that node.
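
For anyone who wants to script this, here is a minimal sketch (not an official Slurm tool) that reads the text printed by "scontrol show node", picks out the State and Reason fields shown elsewhere in this thread, and builds the matching "scontrol update ... state=resume" command. The function names parse_node_fields, is_down and resume_command are made up for illustration, and the sketch assumes the output keeps the usual Key=Value layout.

```python
import re
from typing import Dict, List


def parse_node_fields(scontrol_output: str) -> Dict[str, str]:
    """Extract State and Reason from `scontrol show node` text.

    Assumes the Key=Value layout seen in this thread; other fields are ignored.
    """
    fields: Dict[str, str] = {}
    # State is a single token such as DOWN, IDLE, or DOWN+DRAIN.
    state = re.search(r"\bState=(\S+)", scontrol_output)
    if state:
        fields["State"] = state.group(1)
    # Reason sits on its own line and may contain spaces.
    reason = re.search(r"^\s*Reason=(.*)$", scontrol_output, re.MULTILINE)
    if reason:
        fields["Reason"] = reason.group(1).strip()
    return fields


def is_down(fields: Dict[str, str]) -> bool:
    """Return True when the reported state starts with DOWN."""
    return fields.get("State", "").startswith("DOWN")


def resume_command(node_name: str) -> List[str]:
    """Build the argument list for resuming a node with scontrol."""
    return ["scontrol", "update", f"nodename={node_name}", "state=resume"]


if __name__ == "__main__":
    # Sample text copied from the output quoted in this thread.
    sample = (
        "   State=DOWN ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A\n"
        "   Reason=Node unexpectedly rebooted [slurm@2021-11-16T14:41:04]\n"
    )
    fields = parse_node_fields(sample)
    if is_down(fields):
        print("Node is down:", fields.get("Reason", "no reason recorded"))
        print("Suggested command:", " ".join(resume_command("ecpsc10")))
```

The sketch only prints the command. To actually apply it, the list could be passed to subprocess.run on a host where you have Slurm administrator rights, after confirming that the recorded reason is harmless.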
On Tue, Nov 16, 2021, 10:10 AM Jaep Emmanuel wrote:
> Hi,
>
>
>
> It might be a newbie question since I'm new to slurm.
>
> I'm trying to restart the slurmd service on one of our Ubuntu boxes.
>
>
>
> The slurmd.service is defined by:
>
rFilesystem NONE plugin loaded
check if firewalld is enabled
No
From: slurm-users on behalf of Hadrian Djohari
Reply to: Slurm User Community List
Date: Tuesday, 16 November 2021 at 16:56
To: Slurm User Community List
Subject: Re: [slurm-users] Unable to start slurmd service
There are a few possibilities:
1. Check if munge is working properly. From the scheduler master, run
"munge -n | ssh ecpsc10 unmunge"
2. Check if SELinux is enforcing
3. Check if firewalld or a similar firewall is enabled
4. Check the logs /var/log/slurm/slurmctld.log or slurmd.log on