On 7/26/20 12:21 pm, Paul Raines wrote:
> Thank you so much. This also explains my GPU CUDA_VISIBLE_DEVICES missing
> problem in my previous post.
I've missed that, but yes, that would do it.
> As a new SLURM admin, I am a bit surprised at this default behavior.
> Seems like a way for users to game
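
For what it's worth, here's a quick way to see what's going on (a rough
sketch rather than your exact setup; adjust the salloc options to taste).
By default, without SallocDefaultCommand set, the shell salloc hands back
runs on the machine you submitted from, outside the job's cgroup, so it
never sees the GPU binding; anything launched with srun inside the
allocation runs on the allocated node under the job's cgroup:

   $ salloc -n1 -c3 --gres=gpu:1 --mem=1G
   # this shell is still on the submit host, outside any job cgroup:
   $ env | grep CUDA_VISIBLE_DEVICES      # prints nothing
   # a step launched with srun runs on the node, inside the cgroup:
   $ srun env | grep CUDA_VISIBLE_DEVICES
   # or grab an interactive shell on the node itself:
   $ srun --pty bash
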
On Sat, 25 Jul 2020 2:00am, Chris Samuel wrote:
On Friday, 24 July 2020 9:48:35 AM PDT Paul Raines wrote:
> But when I run a job, on the node where it runs I can find no
> evidence in cgroups of any limits being set.
>
> Example job:
>
> mlscgpu1[0]:~$ salloc -n1 -c3 -p batch --gres=gpu:quadro_rtx_6000:1 --mem=1G
> salloc: Granted job allocation 17
>
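
Once something is actually running under srun, the limits do show up. On a
cgroup v1 box like CentOS 7, and assuming ConstrainRAMSpace=yes in
cgroup.conf, I'd expect roughly the following (the exact hierarchy names
can differ from site to site):

   $ srun cat /proc/self/cgroup
   # expect entries like memory:/slurm/uid_<uid>/job_<jobid>/step_0
   $ srun cat /sys/fs/cgroup/memory/slurm/uid_$(id -u)/job_${SLURM_JOB_ID}/memory.limit_in_bytes
   # should be roughly 1G for a --mem=1G allocation
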
I am not seeing any cgroup limits being put in place on the nodes
when jobs run. I have Slurm 20.02 running on CentOS 7.8.
In slurm.conf I have
ProctrackType=proctrack/cgroup
TaskPlugin=task/affinity,task/cgroup
SelectTypeParameters=CR_Core_Memory
JobAcctGatherType=jobacct_gather/cgroup
and c
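
The slurm.conf side of that looks right to me; whether limits actually get
enforced comes down to cgroup.conf, which is cut off above. Something along
these lines is what I'd expect there, though I'm only guessing at the actual
file:

   CgroupAutomount=yes
   ConstrainCores=yes
   ConstrainRAMSpace=yes
   # ConstrainDevices is what restricts GPU visibility via the devices cgroup
   ConstrainDevices=yes
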