Hi Ahmet,
Another way to do it! Many thanks - very useful :)
But does anyone know why associating the user with my qos stopped their
jobs from running, with reason InvalidQOS?
I can imagine using a user qos to override a partition qos being useful
for other things, so would be nice to know what I've done wrong.
Best,
Mark
On Wed, 1 Apr 2020, mercan wrote:
Hi;
If you have a working job_submit.lua script, you can block new jobs from
a specific user:
if job_desc.user_name == "baduser" then
    return 2045
end
That's all!
Regards;
Ahmet M.
On 1.04.2020 at 16:22, Mark Dixon wrote:
Hi David,
Thanks for this, it sounds like I've not been trying crazy methods - but
they don't work for me:
- "sacctmgr modify user foo set qos=drain" did set up the association
("sacctmgr show associations" showed that QoS changed from "normal" to
"drain"), but this is when foo's jobs refused to start because of reason
"InvalidQOS".
- "sacctmgr update user foo set maxsubmitjobs=0" was ignored because qos
were already set on the partitions.
But... good news!
We hadn't used GrpSubmitJobs in any of our qos, so "sacctmgr modify user
foo set GrpSubmitJobs=0" isn't overridden anywhere, and the effect is
exactly what I wanted - thanks!
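
A minimal sketch of that approach, with a way to undo it later. The
run, block_user and unblock_user helpers are hypothetical names used only
for illustration. The sketch assumes, as above, that no qos in use sets
GrpSubmitJobs, and it only prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Hypothetical helper: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Stop new submissions for one user via the association limit.
# Assumption: no qos sets GrpSubmitJobs, so nothing overrides this value.
block_user() {
    run sacctmgr -i modify user "$1" set GrpSubmitJobs=0
}

# Setting a limit to -1 clears it again.
unblock_user() {
    run sacctmgr -i modify user "$1" set GrpSubmitJobs=-1
}

block_user foo
```

Running it with DRY_RUN=0 applies the change; "sacctmgr show associations"
should then show the new limit for foo.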
But if anyone knows why my attempt at using a "drain" qos stopped foo's
previously submitted jobs from running, I'd be very interested to hear
about it.
Thanks again,
Mark
On Wed, 1 Apr 2020, David Rhey wrote:
Hi Mark,
I *think* you might need to update the user account to have access to
that QoS (as part of their association), using sacctmgr modify user <foo>
plus some additional args (they escape me at the moment).
Also, you *might* have been able to set MaxSubmitJobs at their account
level to 0 and have their jobs run without having to use the QoS
approach - but that's just a guess on my end based on how we've done
some things here. We had a "free period" for our clusters and once it was
over we set GrpSubmitJobs on an account to 0, which allowed in-flight
jobs to continue but no new work to be submitted.
HTH,
David
On Wed, Apr 1, 2020 at 5:57 AM Mark Dixon <mark.c.di...@durham.ac.uk> wrote:
Hi all,
I'm a slurm newbie who has inherited a working slurm 16.05.10 cluster.
I'd like to stop user foo from submitting new jobs but allow their
existing jobs to run.
We have several partitions, each with its own qos and MaxSubmitJobs
typically set to some value. These qos are stopping a "sacctmgr update
user foo set maxsubmitjobs=0" from doing anything useful, as per the
documentation.
I've tried setting up a competing qos:
sacctmgr add qos drain
sacctmgr modify qos drain set MaxSubmitJobs=0
sacctmgr modify qos drain set flags=OverPartQOS
sacctmgr modify user foo set qos=drain
This has successfully prevented the user from submitting new jobs, but
their existing jobs aren't running. I'm seeing the reason code
"InvalidQOS".
Any ideas what I should be looking at, please?
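
One place to start looking, sketched under assumptions: "set qos=drain"
replaces the user's allowed qos list rather than adding to it (the "+="
form adds), so a possible but unconfirmed explanation is that the pending
jobs still carry a qos the association no longer permits. The run and
show_qos_state helpers below are hypothetical names; the sketch only
prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Hypothetical helper: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Compare the qos allowed by the user's associations with the qos
# recorded on each of their pending jobs.
show_qos_state() {
    user=$1
    # Which qos does each association allow, and which is the default?
    run sacctmgr show associations where user="$user" \
        format=User,Account,Partition,QOS,DefaultQOS
    # Job id, partition, qos and pending reason for each pending job.
    run squeue -u "$user" -t PENDING -o "%i %P %q %r"
}

show_qos_state foo
```

If the qos column from squeue names something missing from the
association's QOS list, that mismatch would be consistent with the
InvalidQOS reason.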
Thanks,
Mark
--
David Rhey
---------------
Advanced Research Computing - Technology Services
University of Michigan
--
Mark Dixon <mark.c.di...@durham.ac.uk> Tel: +44(0)191 33 41383
Advanced Research Computing (ARC), Durham University, UK