[slurm-users] Issue with Enforcing GPU Usage Limits in Slurm

2025-04-14 Thread lyz--- via slurm-users
Hi, I am currently encountering an issue with Slurm's GPU resource limitation. I have attempted to restrict the number of GPUs a user can utilize by executing the following command: sacctmgr modify user lyz set MaxTRES=gres/gpu=2 This command is intended to limit user 'lyz' to using a maximum of
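For a per-user GPU cap like this to actually be enforced, the cluster's slurm.conf generally also has to track gres/gpu as a TRES and enable limit enforcement. A minimal sketch, with the accounting settings assumed rather than quoted from the thread:

  # slurm.conf (excerpt), assumed configuration
  AccountingStorageType=accounting_storage/slurmdbd
  AccountingStorageTRES=gres/gpu            # track GPUs as a TRES
  AccountingStorageEnforce=limits           # apply association limits such as MaxTRES

  # set and verify the limit
  sacctmgr modify user lyz set MaxTRES=gres/gpu=2
  sacctmgr show assoc user=lyz format=Cluster,Account,User,MaxTRES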

[slurm-users] Re: [EXT] Re: Issue with Enforcing GPU Usage Limits in Slurm

2025-04-15 Thread lyz--- via slurm-users
Hi, Chris. The cgroup.conf on my GPU node is the same as on the head node. The contents are as follows: CgroupAutomount=yes ConstrainCores=yes ConstrainRAMSpace=yes ConstrainDevices=yes I'll try a newer Slurm version.

[slurm-users] Re: [EXT] Re: Issue with Enforcing GPU Usage Limits in Slurm

2025-04-16 Thread lyz--- via slurm-users
Hi Chris! I didn't modify the cgroup configuration file; I only upgraded the Slurm version. After that, the limits were enforced successfully. It's quite odd. lyz

[slurm-users] Re: Issue with Enforcing GPU Usage Limits in Slurm

2025-04-15 Thread lyz--- via slurm-users
Hi, Christopher. Thank you for your reply. I have already modified the cgroup.conf configuration file in Slurm as follows: vim /etc/slurm/cgroup.conf # # Slurm cgroup support configuration file # # See man slurm.conf and man cgroup.conf for further # information on cgroup configuration parameters
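For reference, a complete cgroup.conf that enables device constraint is usually only a few lines; this sketch mirrors the settings lyz lists elsewhere in the thread rather than reproducing the exact file:

  # /etc/slurm/cgroup.conf
  CgroupAutomount=yes
  ConstrainCores=yes
  ConstrainRAMSpace=yes
  ConstrainDevices=yes    # confine each job to its allocated GPU device files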

[slurm-users] Re: [EXT] Re: Issue with Enforcing GPU Usage Limits in Slurm

2025-04-15 Thread lyz--- via slurm-users
Hi, Sean. It's the latest slurm version. [root@head1 ~]# sinfo --version slurm 22.05.3 And this is the content of gres.conf on the GPU node. # This section of this file was automatically generated by cmd. Do not edit manually! # BEGIN AUTOGENERATED SECTION -- DO NOT REMOVE Name=gpu File=/dev/nvidi
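The gres.conf line is cut off here; on a node with NVIDIA GPUs it typically names the device files explicitly, for example (device count assumed, not taken from the thread):

  # /etc/slurm/gres.conf (sketch)
  Name=gpu File=/dev/nvidia[0-3]
  # or let Slurm detect the GPUs itself, if built with NVML support:
  # AutoDetect=nvml

  # check what the controller believes the node offers
  scontrol show node node11 | grep -i gres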

[slurm-users] Re: [EXT] Re: Issue with Enforcing GPU Usage Limits in Slurm

2025-04-15 Thread lyz--- via slurm-users
Hi, Chris. Thank you for continuing to pay attention to this issue. I followed your instruction, and this is the output: [root@head1 ~]# systemctl cat slurmd | fgrep Delegate Delegate=yes lyz
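Delegate=yes lets slurmd manage its own cgroup subtree, which the device constraint relies on. If it had been missing, a systemd drop-in along these lines could add it (file name hypothetical):

  # /etc/systemd/system/slurmd.service.d/delegate.conf
  [Service]
  Delegate=yes

  # then reload and restart on the GPU node
  systemctl daemon-reload && systemctl restart slurmd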

[slurm-users] Re: [EXT] Re: Issue with Enforcing GPU Usage Limits in Slurm

2025-04-16 Thread lyz--- via slurm-users
Hi, Chris. Thank you again for your instruction. I've tried version 23.11.10. It does work. When I ran the script using the following command, it successfully restricted the usage to the specified CUDA devices: srun -p gpu --gres=gpu:2 --nodelist=node11 python test.py And when I checked the GPU
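One quick way to confirm the constraint from inside an allocation is to print the visible devices in the job step; the node and partition names come from the thread, and the check commands are just one option:

  srun -p gpu --gres=gpu:2 --nodelist=node11 \
       bash -c 'echo CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES; nvidia-smi -L'

With ConstrainDevices=yes working, nvidia-smi inside the step lists only the two allocated GPUs instead of every GPU on node11.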

[slurm-users] Re: [EXT] Re: Issue with Enforcing GPU Usage Limits in Slurm

2025-04-15 Thread lyz--- via slurm-users
Hi, Sean. I followed your instructions and added ConstrainDevices=yes to the /etc/slurm/cgroup.conf file on the server node, and then restarted the relevant services on both the server and the client. However, I still can't enforce the restriction in the Python program. It seems like the restric
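ConstrainDevices=yes only takes effect when the cgroup task and proctrack plugins are active on the compute nodes, so this is worth checking alongside cgroup.conf. A sketch of the slurm.conf lines usually involved (assumed, not quoted from the thread) and the restarts that follow an edit:

  # slurm.conf (excerpt)
  ProctrackType=proctrack/cgroup
  TaskPlugin=task/cgroup          # task/cgroup,task/affinity is also common

  # after editing, restart the daemons
  systemctl restart slurmctld     # head node
  systemctl restart slurmd        # each GPU node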