On 4/14/25 6:27 am, lyz--- via slurm-users wrote:
This command is intended to limit user 'lyz' to a maximum of 2 GPUs. However, when the user submits a job with srun and the job script requests CUDA devices 0, 1, 2, and 3 (or sets os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"), the job still uses all 4 GPUs during execution. This indicates that the GPU limit is not being enforced as expected. How can I resolve this?
You need to make sure you're using cgroups to control access to devices for tasks; the cgroups documentation linked below is a good starting point for reading up on this.
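As a rough sketch of the settings involved (the exact details will depend on your Slurm version and how cgroups are set up on your nodes), it's along these lines:

    # slurm.conf -- have Slurm track and contain tasks with cgroups
    ProctrackType=proctrack/cgroup
    TaskPlugin=task/cgroup

    # cgroup.conf -- restrict each job to the devices (GPUs) it was allocated
    ConstrainDevices=yes

With ConstrainDevices=yes the job's cgroup should only be granted access to the GPUs Slurm actually allocated, so setting CUDA_VISIBLE_DEVICES by hand in the job script won't give the job access to the other devices.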
https://slurm.schedmd.com/cgroups.html

Good luck!

All the best,
Chris

--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA