Hi Carsten,
Thank you very much for pointing me in the right direction. I think this is what I'm looking for, and I will try it out.
Best regards
Peter
29.07.2021, 15:13, "Carsten Beyer" <be...@dkrz.de>:
Hi Peter,
You could create a reservation with scontrol and list only the root user
(or any other test users) in its 'users' field, e.g.

    scontrol create reservation=test nodes=<nodename> starttime=<starttime> duration=<how-long> users=root

Then you need to pass the reservation name to your test job, either in
the sbatch script or on the command line (--reservation=test).
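
The steps above can be sketched as a small script. This is only a rough outline under assumptions: the node name gpu-node01, the start time, the duration, and the batch script testjob.sh are hypothetical placeholders, and the run helper is made up for illustration. By default it only prints the commands so they can be checked before anything is changed on the cluster.

```shell
#!/bin/sh
# Sketch: reserve one node for root, submit a test job into the
# reservation, then release it. All values below are assumptions.
RES_NAME="test"            # reservation name from the example above
NODE="gpu-node01"          # hypothetical node name
START="now+1hour"          # relative start time; a timestamp also works
DURATION="60"              # reservation length in minutes
TEST_SCRIPT="testjob.sh"   # hypothetical batch script

# Illustrative helper: with DRY_RUN=1 (the default) the command is only
# printed; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# 1. Create the reservation so that only root may use the node.
#    Slurm may reject it if running jobs overlap the start time;
#    choose a later start time in that case.
run scontrol create reservation="$RES_NAME" nodes="$NODE" \
    starttime="$START" duration="$DURATION" users=root

# 2. Submit the test job into the reservation.
run sbatch --reservation="$RES_NAME" "$TEST_SCRIPT"

# 3. Once the test job has finished, remove the reservation again.
run scontrol delete reservation="$RES_NAME"
```

Inside a batch script the same option can be given as a directive line, "#SBATCH --reservation=test", instead of on the command line.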
Cheers,
Carsten
On 29.07.2021 at 11:59, Peter Schmidt wrote:
Hello everyone,
I administer a Slurm GPU cluster, and from time to time I need to run
test jobs. The issue is that my users allocate all GPUs as soon as they
become available, which makes testing impossible for me.
I could drain a node and wait until all jobs have finished, but as soon
as I resume it to run my test job, the queued user jobs are
scheduled on that node. Is there a way to drain a node and still be
able to schedule jobs for select users (e.g. root)? I currently don't
have any kind of priority system in my cluster.
Thank you in advance and best regards
-- 
Carsten Beyer
Systems Department
Deutsches Klimarechenzentrum GmbH (DKRZ)
Bundesstraße 45a * D-20146 Hamburg * Germany
Phone: +49 40 460094-221
Fax: +49 40 460094-270
Email: be...@dkrz.de
URL: http://www.dkrz.de
Managing Director: Prof. Dr. Thomas Ludwig
Registered office: Hamburg
District Court Hamburg, HRB 39784