[slurm-users] Troubles with cgroups

Hermann Schwärzler Thu, 16 Mar 2023 07:56:36 -0700

Dear Slurm users,

after opening our new cluster (62 nodes with 250 GB RAM and 64 cores each, Rocky Linux 8.6, kernel 4.18.0-372.16.1.el8_6.0.1, Slurm 22.05) for "friendly user" test operation about 6 weeks ago, we soon faced serious problems with nodes that suddenly become unresponsive (so much so that only a hard reboot via IPMI gets them back).


We were able to narrow the problem down to one similar to this one:
https://github.com/apptainer/singularity/issues/5850

In our case, however, it is not related to Singularity but to cgroups in general.

We are using cgroups in our Slurm configuration to limit RAM, CPUs and devices. In the beginning we did *not* limit swap space (we are doing so now to work around the problem but would like to allow at least some swap space).
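The limits described above map onto a handful of `cgroup.conf` parameters (`ConstrainRAMSpace`, `ConstrainCores`, `ConstrainDevices`, `ConstrainSwapSpace`, `AllowedSwapSpace`, and `CgroupPlugin`). A minimal sketch for listing the active ones on a node follows. The path `/etc/slurm/cgroup.conf` and the helper name `show_cgroup_constraints` are assumptions for illustration and are not taken from this post.

```shell
#!/bin/sh
# Hypothetical helper: print the cgroup-related settings from a cgroup.conf file.
# Commented-out lines are skipped; matching is case-insensitive.
show_cgroup_constraints() {
    grep -Ei '^[[:space:]]*(CgroupPlugin|Constrain[A-Za-z]+|AllowedSwapSpace)[[:space:]]*=' "$1"
}

# Assumed default location; adjust to the local installation.
conf=/etc/slurm/cgroup.conf
if [ -r "$conf" ]; then
    show_cgroup_constraints "$conf"
fi
```

Comparing this output between the "swap limited" workaround and the "swap not limited" configuration makes it clear which setting differs between the two.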

We are able to reproduce the problem *outside* Slurm as well, using the small test program mentioned in the above Singularity GitHub issue (https://gist.github.com/pja237/b0e9a49be64a20ad1af905305487d41a) with these steps (for cgroups/v1):


cd /sys/fs/cgroup/memory
mkdir test
cd test
echo $((5*1024*1024*1024)) > memory.limit_in_bytes
echo $$ > cgroup.procs
/path/to/mempoc 2 10

After about 10 to 30 minutes the problem occurs.
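Because the reproduction steps differ between cgroups/v1 and cgroups/v2, it can help to confirm which hierarchy a node actually mounts before running them. The following is a small sketch, assuming GNU coreutils `stat`; the function name `cgroup_version_from_fstype` is made up for this example.

```shell
#!/bin/sh
# Map the filesystem type of /sys/fs/cgroup to a cgroup version label.
# cgroup2fs means the unified (v2) hierarchy is mounted there;
# tmpfs means a v1 (or hybrid) layout with per-controller mounts below it.
cgroup_version_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# Query the running system; prints "unknown" if stat cannot report a type.
cgroup_version_from_fstype "$(stat -fc %T /sys/fs/cgroup 2>/dev/null)"
```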

We tried switching to cgroups/v2, which does solve the problem for the manual case outside Slurm:


cd /sys/fs/cgroup
mkdir test
cd test
echo "+memory" > cgroup.subtree_control
mkdir test2
echo $((5*1024*1024*1024)) > test2/memory.high
echo $$ > test2/cgroup.procs
/path/to/mempoc 2 10

Now it runs for days and weeks without any issues!
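To see how the manual cgroups/v2 test behaves while it runs, the memory files of the test cgroup can be printed periodically. In cgroups/v2, `memory.high` is a throttling threshold while `memory.max` is the hard limit, so printing both shows which of the two a given cgroup actually has set. This is only a sketch; the helper name `show_cgroup_mem` is hypothetical, and the directory is the one created in the steps above.

```shell
#!/bin/sh
# Hypothetical helper: print selected cgroup v2 memory files of one cgroup
# directory as name=value lines. Files that do not exist are skipped.
show_cgroup_mem() {
    dir="$1"
    for f in memory.current memory.swap.current memory.high memory.max memory.swap.max; do
        if [ -r "$dir/$f" ]; then
            printf '%s=%s\n' "$f" "$(cat "$dir/$f")"
        fi
    done
}

# Inspect the cgroup from the manual test, if it exists on this machine.
if [ -d /sys/fs/cgroup/test/test2 ]; then
    show_cgroup_mem /sys/fs/cgroup/test/test2
fi
```

Running the same helper against a Slurm job's cgroup gives a like-for-like view of the limits in both cases.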

But when we run the same thing in Slurm (with cgroups/v2 configured to *not limit* swapping) by using


sbatch --mem=5G --cpus-per-task=10 \
       --wrap "/path/to/mempoc 2 10"

the nodes still become unusable after some time (1 to 5 hours) with the usual symptoms.
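One way to compare the Slurm case with the manual one is to look up the cgroup that the job's process actually lives in and inspect its limit files. The sketch below derives the cgroup directory from a file in `/proc/<pid>/cgroup` format, where the cgroups/v2 entry is the line starting with `0::`. The function name `cgroup_v2_dir_from_file` is an assumption for illustration, and the mount point defaults to `/sys/fs/cgroup`.

```shell
#!/bin/sh
# Hypothetical helper: given a file in /proc/<pid>/cgroup format, print the
# absolute directory of the process's cgroup v2 group.
#   $1: path to the cgroup file
#   $2: cgroup v2 mount point (optional, defaults to /sys/fs/cgroup)
# Returns non-zero if the file has no "0::" (cgroup v2) entry.
cgroup_v2_dir_from_file() {
    rel=$(sed -n 's/^0:://p' "$1" | head -n 1)
    [ -n "$rel" ] || return 1
    printf '%s%s\n' "${2:-/sys/fs/cgroup}" "$rel"
}

# Example on a compute node while the job runs (run as root):
#   dir=$(cgroup_v2_dir_from_file "/proc/$(pgrep -n mempoc)/cgroup")
#   cat "$dir/memory.max" "$dir/memory.high" "$dir/memory.swap.max"
```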


Has any of you faced similar issues?
Are we missing something?

Is it unreasonable to think our systems should stay stable even when there is cgroup-based swapping?


Kind regards,
Hermann
