Re: [slurm-users] [EXT] --mem is not limiting the job's memory

2023-06-22 Thread Sean Crosby
On the worker node, check if cgroups are mounted: grep cgroup /proc/mounts (normally it's in /sys/fs/cgroup). Then check if Slurm is setting up the cgroup: find /sys/fs/cgroup | grep slurm. E.g. [root@spartan-gpgpu164 ~]# find /sys/fs/cgroup/memory | grep slurm /sys/fs/cgroup/memory/slurm /sys/f
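
The checks quoted above, restated as a runnable sequence (the /sys/fs/cgroup/memory path assumes a cgroup v1 layout, as in the example output):

    # confirm the kernel has cgroups mounted at all
    grep cgroup /proc/mounts
    # confirm slurmd is creating per-job cgroups under the memory controller
    find /sys/fs/cgroup/memory | grep slurm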

Re: [slurm-users] [EXT] --mem is not limiting the job's memory

2023-06-22 Thread Boris Yazlovitsky
It's still not constraining memory... a memhog job continues to memhog: boris@rod:~/scripts$ sacct --starttime=2023-05-01 --format=jobid,user,start,elapsed,reqmem,maxrss,maxvmsize,nodelist,state,exit -j 199 JobID User Start Elapsed ReqMem MaxRSS MaxVMSize
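
The sacct call above, reflowed for readability (job 199 is the test job from this thread); if MaxRSS comes back well above ReqMem, the cgroup limit is not being enforced:

    sacct --starttime=2023-05-01 \
          --format=jobid,user,start,elapsed,reqmem,maxrss,maxvmsize,nodelist,state,exit \
          -j 199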

Re: [slurm-users] [EXT] --mem is not limiting the job's memory

2023-06-22 Thread Ozeryan, Vladimir
No worries. No, we don't have any OS-level settings, only "allowed_devices.conf", which just has /dev/random, /dev/tty and stuff like that. But I think this could be the culprit: check out the man page for cgroup.conf, AllowedRAMSpace=100. I would just leave these four: CgroupAutomount=yes ConstrainCor
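
A minimal cgroup.conf along the lines Vladimir describes - just the four lines he keeps, spelled out in full in his earlier message further down this page:

    # /etc/slurm/cgroup.conf - minimal memory-enforcement setup
    CgroupAutomount=yes
    ConstrainCores=yes
    ConstrainDevices=yes
    ConstrainRAMSpace=yes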

Re: [slurm-users] [EXT] --mem is not limiting the job's memory

2023-06-22 Thread Boris Yazlovitsky
Thank you, Vlad - looks like we have the same yes's. Do you remember whether you had to change any settings at the OS level or in the kernel to make it work? -b On Thu, Jun 22, 2023 at 5:31 PM Ozeryan, Vladimir <vladimir.ozer...@jhuapl.edu> wrote: > Hello, > > > > We have the following configured and it

Re: [slurm-users] [EXT] --mem is not limiting the job's memory

2023-06-22 Thread Ozeryan, Vladimir
Hello, We have the following configured and it seems to be working ok. CgroupAutomount=yes ConstrainCores=yes ConstrainDevices=yes ConstrainRAMSpace=yes Vlad. From: slurm-users On Behalf Of Boris Yazlovitsky Sent: Thursday, June 22, 2023 4:50 PM To: Slurm User Community List Subject: Re: [sl
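
Not quoted anywhere in this thread, but cgroup.conf only takes effect if slurm.conf routes jobs through the cgroup plugins and treats memory as a consumable resource. A sketch of the usual companion settings (the parameter names are standard Slurm options; the exact values shown are assumptions, not taken from the thread):

    # /etc/slurm/slurm.conf (excerpt) - assumed companion settings
    ProctrackType=proctrack/cgroup
    TaskPlugin=task/cgroup
    # memory must be a consumable resource for --mem to be scheduled and enforced
    SelectType=select/cons_tres
    SelectTypeParameters=CR_Core_Memory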

Re: [slurm-users] [EXT] --mem is not limiting the job's memory

2023-06-22 Thread Boris Yazlovitsky
Hello Vladimir, thank you for your response. This is the cgroup.conf file: CgroupAutomount=yes ConstrainCores=yes ConstrainDevices=yes ConstrainRAMSpace=yes ConstrainSwapSpace=yes MaxRAMPercent=90 AllowedSwapSpace=0 AllowedRAMSpace=100 MemorySwappiness=0 MaxSwapPercent=0 /etc/default/grub: GRUB_
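
The GRUB_ line is cut off in the archive, so what Boris actually had is unknown. For illustration only, kernel parameters commonly associated with memory-cgroup accounting on Ubuntu look like the following (a hypothetical example, not the original contents):

    # /etc/default/grub (illustrative only - the original line is truncated above)
    GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
    # apply with: sudo update-grub && sudo reboot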

Re: [slurm-users] [EXT] --mem is not limiting the job's memory

2023-06-22 Thread Ozeryan, Vladimir
--mem=5G should allocate 5G of memory per node. Are your cgroups configured? From: slurm-users On Behalf Of Boris Yazlovitsky Sent: Thursday, June 22, 2023 3:28 PM To: slurm-users@lists.schedmd.com Subject: [EXT] [slurm-users] --mem is not limiting the job's memory APL external email warning:

[slurm-users] --mem is not limiting the job's memory

2023-06-22 Thread Boris Yazlovitsky
Running Slurm 22.03.02 on an Ubuntu 22.04 server. Jobs submitted with --mem=5g are able to allocate an unlimited amount of memory. How can the amount of memory a job grabs be limited at the job-submission level? Thanks, and best regards! Boris
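
A quick way to exercise the limit once cgroups are configured (memhog ships with the numactl package; the 8g figure is arbitrary, just anything larger than the request):

    # ask for 5 GB, then deliberately try to touch 8 GB;
    # with ConstrainRAMSpace=yes the step should be OOM-killed instead of succeeding
    srun --mem=5G memhog 8g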