On 7/26/20 12:21 pm, Paul Raines wrote:
> Thank you so much. This also explains my GPU CUDA_VISIBLE_DEVICES missing
> problem in my previous post.
I've missed that, but yes, that would do it.
> As a new SLURM admin, I am a bit surprised at this default behavior.
> Seems like a way for users to game
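
For what it's worth, here's a quick way to see what's going on (a rough
sketch rather than your exact setup; adjust the salloc options to taste).
By default, without SallocDefaultCommand set, the shell salloc hands back
runs on the machine you submitted from, outside the job's cgroup, so it
never sees the GPU binding; anything launched with srun inside the
allocation runs on the allocated node under the job's cgroup:

   $ salloc -n1 -c3 --gres=gpu:1 --mem=1G
   # this shell is still on the submit host, outside any job cgroup:
   $ env | grep CUDA_VISIBLE_DEVICES      # prints nothing
   # a step launched with srun runs on the node, inside the cgroup:
   $ srun env | grep CUDA_VISIBLE_DEVICES
   # or grab an interactive shell on the node itself:
   $ srun --pty bash
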
On Sat, 25 Jul 2020 2:00am, Chris Samuel wrote:
On Friday, 24 July 2020 9:48:35 AM PDT Paul Raines wrote:
> But when I run a job, on the node where it runs I can find no
> evidence in cgroups of any limits being set.
>
> Example job:
>
> mlscgpu1[0]:~$ salloc -n1 -c3 -p batch --gres=gpu:quadro_rtx_6000:1 --mem=1G
> salloc: Granted job allocation 17
>
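
Once something is actually running under srun, the limits do show up. On a
cgroup v1 box like CentOS 7, and assuming ConstrainRAMSpace=yes in
cgroup.conf, I'd expect roughly the following (the exact hierarchy names
can differ from site to site):

   $ srun cat /proc/self/cgroup
   # expect entries like memory:/slurm/uid_<uid>/job_<jobid>/step_0
   $ srun cat /sys/fs/cgroup/memory/slurm/uid_$(id -u)/job_${SLURM_JOB_ID}/memory.limit_in_bytes
   # should be roughly 1G for a --mem=1G allocation
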
I am not seeing any cgroup limits being put in place on the nodes
when jobs run. I have Slurm 20.02 running on CentOS 7.8.
In slurm.conf I have
ProctrackType=proctrack/cgroup
TaskPlugin=task/affinity,task/cgroup
SelectTypeParameters=CR_Core_Memory
JobAcctGatherType=jobacct_gather/cgroup
and c
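
The slurm.conf side of that looks right to me; whether limits actually get
enforced comes down to cgroup.conf, which is cut off above. Something along
these lines is what I'd expect there, though I'm only guessing at the actual
file:

   CgroupAutomount=yes
   ConstrainCores=yes
   ConstrainRAMSpace=yes
   # ConstrainDevices is what restricts GPU visibility via the devices cgroup
   ConstrainDevices=yes
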