[slurm-users] Optimizing CPU socket affinities and NVLink

2024-08-08 Thread Matthew R. Baney via slurm-users
Hello, I've recently adopted setting AutoDetect=nvml in our GPU nodes' gres.conf files to automatically populate Cores and Links for GPUs, which has been working well. I'm now wondering if I can prioritize having single-GPU jobs scheduled on NVLink pairs (these are PCIe A6000s) where one of the G

[slurm-users] Integrate podman to run containers

2024-08-08 Thread amin.raeiszadeh--- via slurm-users
Hello, I have a test cluster consisting of two nodes, one as the controller and the other as a compute node. I followed all the steps from the Slurm documentation and I want to run jobs as containers, but I get the following error when running podman run hello-world on the controller node: time="2024-08-06T12:02