On the worker node, check whether cgroups are mounted:
grep cgroup /proc/mounts
(normally they are mounted under /sys/fs/cgroup)
Then check whether Slurm is setting up its cgroup:
find /sys/fs/cgroup | grep slurm
e.g.
[root@spartan-gpgpu164 ~]# find /sys/fs/cgroup/memory | grep slurm
/sys/fs/cgroup/memory/slurm
/sys/f
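A rough sketch of the mount check above as a reusable function, in case it helps. The function name cgroup_mode is made up for illustration. It only classifies the filesystem types listed in a mounts table. As far as I know, Ubuntu 22.04 boots with the unified (v2) hierarchy by default, in which case there is no /sys/fs/cgroup/memory directory to search.

```shell
#!/bin/sh
# cgroup_mode is a hypothetical helper, not part of Slurm.
# It reports which cgroup hierarchy a mounts table describes.
# $1: path to a mounts file (defaults to /proc/mounts).
cgroup_mode() {
    mounts_file="${1:-/proc/mounts}"
    # Field 3 of each /proc/mounts line is the filesystem type.
    awk '
        $3 == "cgroup"  { v1 = 1 }   # per-controller mounts, e.g. /sys/fs/cgroup/memory
        $3 == "cgroup2" { v2 = 1 }   # unified hierarchy
        END {
            if (v1 && v2) print "hybrid"
            else if (v1)  print "v1"
            else if (v2)  print "v2"
            else          print "none"
        }' "$mounts_file"
}
```

Calling cgroup_mode with no argument on the worker node reads /proc/mounts directly.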
It's still not constraining memory...
A memhog job continues to hog memory:
boris@rod:~/scripts$ sacct --starttime=2023-05-01 --format=jobid,user,start,elapsed,reqmem,maxrss,maxvmsize,nodelist,state,exit -j 199
JobID User Start Elapsed ReqMem MaxRSS MaxVMSize
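While the job is still running, it may be quicker to look at the limit the kernel was actually given than to wait for sacct. A minimal sketch, assuming Slurm names the per-job cgroup directory job_<jobid> (worth verifying on your node). The function name job_mem_limits is invented here. memory.limit_in_bytes is the cgroup v1 file and memory.max is its v2 counterpart.

```shell
#!/bin/sh
# job_mem_limits is a hypothetical helper, not a Slurm command.
# It prints "<file> <value>" for each memory limit file found for a job.
# $1: job id; $2: cgroup root (defaults to /sys/fs/cgroup).
job_mem_limits() {
    jobid="$1"
    root="${2:-/sys/fs/cgroup}"
    # Assumption: the per-job directory is called job_<jobid>.
    find "$root" -type d -name "job_${jobid}" 2>/dev/null |
    while IFS= read -r dir; do
        # v1 exposes memory.limit_in_bytes, v2 exposes memory.max.
        for f in memory.limit_in_bytes memory.max; do
            if [ -r "$dir/$f" ]; then
                printf '%s %s\n' "$dir/$f" "$(cat "$dir/$f")"
            fi
        done
    done
}
```

For job 199 submitted with --mem=5G, a constrained job should show 5368709120. A very large number on v1, or the word max on v2, means no limit was applied. No output at all means no job_199 directory was found, for example because the job has already finished.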
No worries,
No, we don’t have any OS level settings, only “allowed_devices.conf” which just
has /dev/random, /dev/tty and stuff like that.
But I think this could be the culprit; check the cgroup.conf man page:
AllowedRAMSpace=100
I would just leave these four:
CgroupAutomount=yes
ConstrainCor
thank you Vlad - looks like we have the same yes's
Do you remember if you had to make any settings on the OS level or in the
kernel to make it work?
-b
On Thu, Jun 22, 2023 at 5:31 PM Ozeryan, Vladimir <
vladimir.ozer...@jhuapl.edu> wrote:
> Hello,
>
> We have the following configured and it
Hello,
We have the following configured and it seems to be working ok.
CgroupAutomount=yes
ConstrainCores=yes
ConstrainDevices=yes
ConstrainRAMSpace=yes
Vlad.
From: slurm-users On Behalf Of Boris Yazlovitsky
Sent: Thursday, June 22, 2023 4:50 PM
To: Slurm User Community List
Subject: Re: [sl
Hello Vladimir, thank you for your response.
This is the cgroup.conf file:
CgroupAutomount=yes
ConstrainCores=yes
ConstrainDevices=yes
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
MaxRAMPercent=90
AllowedSwapSpace=0
AllowedRAMSpace=100
MemorySwappiness=0
MaxSwapPercent=0
/etc/default/grub:
GRUB_
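One thing worth ruling out: the Constrain* settings in cgroup.conf are read by the task/cgroup plugin, so they do nothing unless slurm.conf lists task/cgroup in TaskPlugin. A small sketch of that check follows. The function name has_task_cgroup and the default path /etc/slurm/slurm.conf are assumptions, so adjust them for your install.

```shell
#!/bin/sh
# has_task_cgroup is a hypothetical helper, not a Slurm command.
# It succeeds (exit 0) if the given slurm.conf enables the task/cgroup plugin.
# $1: path to slurm.conf (default path below is an assumed location).
has_task_cgroup() {
    conf="${1:-/etc/slurm/slurm.conf}"
    # Strip comments, then look for a TaskPlugin line that lists task/cgroup.
    # Matching is case-insensitive, as Slurm parameter names are.
    sed 's/#.*//' "$conf" |
        grep -Eiq '^[[:space:]]*TaskPlugin[[:space:]]*=.*task/cgroup'
}
```

On a live cluster, scontrol show config | grep -i TaskPlugin shows what the running daemons actually loaded, which can differ from the file if they have not been restarted since the last edit.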
--mem=5G should allocate 5G of memory per node.
Are your cgroups configured?
From: slurm-users On Behalf Of Boris Yazlovitsky
Sent: Thursday, June 22, 2023 3:28 PM
To: slurm-users@lists.schedmd.com
Subject: [EXT] [slurm-users] --mem is not limiting the job's memory
Running Slurm 22.03.02 on Ubuntu 22.04 server.
Jobs submitted with --mem=5g are able to allocate an unlimited amount of
memory.
How can I limit, at the job submission level, how much memory a job can grab?
thanks, and best regards!
Boris