Hello,

I need your help. I have a user who is able to bypass his account limit, and I can't figure out why.
We have two partitions: a lower-tier one named "long" (all hosts) and a higher-tier one named "sfglab" (only one host). Here is their config:
PartitionName=long Nodes=dgx-[1-4],sr-[1-3] MaxTime=10-0 State=UP PriorityTier=10000 QOS=long OverSubscribe=NO DefaultTime=2-00:00:00 PreemptMode=suspend TRESBillingWeights=CPU=1,Mem=0.062G,GRES/gpu=72.458
PartitionName=sfglab Nodes=dgx-1 MaxTime=10-0 State=UP PriorityTier=20000 PreemptMode=off OverSubscribe=NO AllowAccounts=sfglab AllowGroups=sfglab TRESBillingWeights=CPU=1,Mem=0.062G,GRES/gpu=72.458 Hidden=NO
Partition "long" also has a QOS assigned (also named "long") with the limit: cpu=450,gres/gpu=28,mem=6T
We also have an account (also named "sfglab") with limits:
sacctmgr modify account where name=sfglab set GrpTRES=cpu=300,gres/gpu=20,mem=5T
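To spell out the expectation behind these two limits, here is a minimal Python sketch. It is not part of Slurm: `parse_tres` and `exceeded` are hypothetical helpers, the 32-GPU allocation is just an example value, memory suffixes are assumed to be binary units, and both limits are treated as plain caps on the total allocation.

```python
# Hypothetical helpers, not part of Slurm: parse a TRES limit string and
# report which limits a given total allocation would exceed.

# Assumption: memory suffixes are interpreted as binary units, in bytes.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_tres(spec):
    """Turn 'cpu=300,gres/gpu=20,mem=5T' into a dict of integer limits."""
    limits = {}
    for item in spec.split(","):
        name, value = item.split("=", 1)
        suffix = value[-1].upper()
        if suffix in UNITS:
            limits[name] = int(value[:-1]) * UNITS[suffix]
        else:
            limits[name] = int(value)
    return limits


def exceeded(allocated, limits):
    """Return the sorted TRES names whose allocated amount is above the limit."""
    return sorted(
        name
        for name, used in allocated.items()
        if name in limits and used > limits[name]
    )


qos_long = parse_tres("cpu=450,gres/gpu=28,mem=6T")
account_sfglab = parse_tres("cpu=300,gres/gpu=20,mem=5T")

# Assumed example allocation: 32 GPUs in total, other TRES ignored.
allocation = {"gres/gpu": 32}
print("over QOS limit:", exceeded(allocation, qos_long))
print("over account limit:", exceeded(allocation, account_sfglab))
```

Under these assumptions, an allocation of 32 GPUs is above both the QOS limit (28) and the account limit (20), so I would expect it to be refused.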
As a result, the user is able to allocate more than 32 GPUs, and I don't understand why :(
We use Slurm 23.02.2.

--
Best regards,
Michał Kadlof
Head of the high performance computing center
Eden^N cluster administrator
Structural and Functional Genomics Laboratory
Faculty of Mathematics and Computer Science
Warsaw University of Technology