Good morning everyone.
I'm having an issue, and I don't know whether it is a bug or a feature. I've created a QOS:

    sacctmgr add qos myqos set GrpTRESMins=cpu=10 flags=NoDecay

I know the limit is too low, but I just wanted to give you an example.

Whenever a user submits a job with this QOS and the job reaches the limit I've defined, the job is cancelled and I lose all the computation it had done so far. Is it possible to create a QOS or Slurm setting so that, when users reach the limit, the job state changes to pending? That way I could increase the limit and change the job state back to running, so the job can continue until it completes.

I know this is a little odd, but I have users who have requested CPU time under an agreement between our HPC center and their institutions. I know limits are set so they can be enforced. What I'm trying to prevent is, for example, a person having a job running for 2 months and ending up with no data because the job needed just a few more days. This could be prevented if the job went into a pending state after reaching the limit, so that I could grant them a couple more days of CPU time.
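To make the arithmetic behind the cpu=10 example concrete, here is a minimal sketch of how long a single job could run before hitting the limit. It assumes that usage accrues as allocated CPUs multiplied by elapsed wall-clock minutes and that no other job is drawing on the same QOS. The function name and parameters are hypothetical and are not part of Slurm.

```python
def wall_minutes_until_limit(cpu_minutes_limit, cpu_minutes_used, allocated_cpus):
    """Estimate the wall-clock minutes left before a CPU-minute budget runs out.

    Assumption: usage grows by `allocated_cpus` CPU-minutes per wall-clock
    minute, and only this one job consumes the budget.
    """
    if allocated_cpus <= 0:
        raise ValueError("allocated_cpus must be positive")
    # A budget that is already exceeded leaves no time at all.
    remaining = max(cpu_minutes_limit - cpu_minutes_used, 0)
    return remaining / allocated_cpus


# With GrpTRESMins=cpu=10 and nothing used yet, a 4-CPU job gets 2.5 minutes.
print(wall_minutes_until_limit(10, 0, 4))
```

The same calculation with real numbers would show how much extra CPU time I would have to grant for a job that is a few days short of finishing.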
Cumprimentos / Best Regards,
Zacarias Benta
INCD @ LIP - Universidade do Minho