The end goal is to see the following two things:
the jobs under the slurmstepd cgroup path, and
at least cpu, cpuset, and memory listed in the cgroup.controllers file
within the job cgroups.
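A quick way to check the second point is to test whether a cgroup.controllers line contains all of the controllers Slurm needs. This is a minimal sketch; the helper name and the example cgroup path are assumptions, and the exact layout depends on your systemd slice and Slurm version.

```shell
# Return 0 if the given cgroup.controllers contents include the
# controllers Slurm's cgroup v2 plugin expects, nonzero otherwise.
required_controllers_present() {
  # $1: contents of a cgroup.controllers file, e.g. "cpuset cpu memory pids"
  for c in cpuset cpu memory; do
    case " $1 " in
      *" $c "*) ;;        # controller present, keep checking
      *) return 1 ;;      # controller missing
    esac
  done
}

# On a node, roughly (path is hypothetical):
#   required_controllers_present "$(cat /sys/fs/cgroup/system.slice/cgroup.controllers)"
```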
The pattern you have would be the processes left over after boot, from the
first failed slurmd service start.
Thanks for the hint.
So you end up with two "slurmstepd infinity" processes, like I did when I
tried this workaround?
[root@node ~]# ps aux | grep slurm
root 1833 0.0 0.0 33716 2188 ? Ss 21:02 0:00
/usr/sbin/slurmstepd infinity
root 2259 0.0 0.0 236796 12108 ? Ss
There needs to be a slurmstepd infinity process running before slurmd starts.
This doc goes into it:
https://slurm.schedmd.com/cgroup_v2.html
There is probably a better way to do this, but this is what we do to deal with it:
::
files/slurm-cgrepair.service
::
[Unit]
Before=slurmd.service
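The fragment above is cut off; for context, one plausible shape of such a "cgroup repair" unit is sketched below. The Description, ExecStart command, and controller list are assumptions inferred from the thread (the unit's apparent job is to delegate the missing controllers before slurmd starts), not the poster's actual file; verify paths against your own system.

```ini
# Hypothetical completion of files/slurm-cgrepair.service
[Unit]
Description=Enable cpu/cpuset/memory cgroup controllers before slurmd starts
Before=slurmd.service

[Service]
Type=oneshot
# Delegate the controllers slurmd expects to find; the target path
# (system.slice) is an assumption and may differ on your distribution.
ExecStart=/bin/sh -c 'echo +cpuset +cpu +memory > /sys/fs/cgroup/system.slice/cgroup.subtree_control'

[Install]
WantedBy=multi-user.target
```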
I observe the same behavior on Slurm 23.11.5 / Rocky Linux 8.9:
> [root@compute ~]# cat /sys/fs/cgroup/cgroup.subtree_control
> memory pids
> [root@compute ~]# systemctl disable slurmd
> Removed /etc/systemd/system/multi-user.target.wants/slurmd.service.
> [root@compute ~]# cat /sys/fs/cgroup/cgroup.su
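The subtree_control output quoted above shows only memory and pids delegated. A minimal sketch of enabling the missing controllers follows; the function name is hypothetical, and on a real node the target is the standard cgroup v2 mount (run as root). The directory is taken as a parameter only so the sketch can be exercised safely.

```shell
# Delegate the cpu, cpuset and memory controllers to the children of a
# cgroup by writing to its cgroup.subtree_control file.
enable_controllers() {
  echo "+cpuset +cpu +memory" > "$1/cgroup.subtree_control"
}

# On a real node (as root), roughly:
#   enable_controllers /sys/fs/cgroup
```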