[slurm-users] Preemption not working in 20.11

2021-02-26 Thread Prentice Bisbal
We recently upgraded from Slurm 19.05.8 to 20.11.3. In our configuration, we have a preemptable partition named 'interruptible' for long-running, low-priority jobs that use checkpoint/restart. Jobs that are preempted are killed and requeued rather than suspended. This configuration has …
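For readers setting up something similar, requeue-style preemption is normally driven by a handful of slurm.conf settings. A minimal sketch, assuming a two-tier partition layout (partition and node names here are illustrative, not taken from the original post):

    # slurm.conf (sketch): requeue-style preemption by partition priority
    PreemptType=preempt/partition_prio
    PreemptMode=REQUEUE   # preempted jobs are killed and requeued, not suspended

    # Low-priority, preemptable partition for checkpoint/restart jobs
    PartitionName=interruptible Nodes=node[01-16] PriorityTier=1  PreemptMode=REQUEUE
    # Higher-priority partition whose jobs may preempt the one above
    PartitionName=general       Nodes=node[01-16] PriorityTier=10 PreemptMode=OFF Default=YES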

Re: [slurm-users] [External] Preemption not working in 20.11

2021-02-26 Thread Michael Robbert
We saw something that sounds similar to this. See this bug report: https://bugs.schedmd.com/show_bug.cgi?id=10196. SchedMD never found the root cause; they thought it might have something to do with a timing problem in Prolog scripts, but the thing that fixed it for us was to set GraceTime=0 on …
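For context, GraceTime is a per-partition setting: the number of seconds a preempted job is given, after being signaled, before it is killed. The workaround described above would look roughly like this; the partition name is assumed from the earlier post, not quoted from it:

    # slurm.conf (sketch): no grace period for preempted jobs
    PartitionName=interruptible Nodes=node[01-16] PriorityTier=1 PreemptMode=REQUEUE GraceTime=0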

Re: [slurm-users] Curious performance results

2021-02-26 Thread Volker Blum
Thank you! I’ll see if this is an option … would be nice. I’ll see if we can try this.

Best wishes
Volker

> On Feb 25, 2021, at 11:07 PM, Angelos Ching wrote:
>
> I think it's related to the job step launch semantic change introduced at 20.11.0, which has been reverted since 20.11.3, see …
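The change being referred to, as far as I understand it: from 20.11.0 through 20.11.2, a job step launched with plain srun was no longer given the whole allocation by default, and 20.11.3 reverted that default while keeping flags to select either behavior explicitly. A hedged sketch; check the release notes for your exact version:

    # Ask for all resources of the allocated nodes (the pre-20.11 default,
    # restored as the default in 20.11.3):
    srun --whole ./my_mpi_app

    # Opt in to the stricter 20.11.0-style behavior, where the step gets
    # only the resources it explicitly requests:
    srun --exact ./my_mpi_app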

Re: [slurm-users] GPU exclusively for one account

2021-02-26 Thread Ole Holm Nielsen
On 2/26/21 8:44 AM, Baldauf, Sebastian Martin wrote:
> I just want to ask if someone has an idea how to give a GPU and some CPUs of a node to one account exclusively, but keep the remaining CPUs of the node available for all users. To me it looks like using partitions only works for whole …
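One way to approximate this (a sketch only; the node, account, and CPU counts are made up) is two overlapping partitions on the GPU node: a restricted partition for the privileged account, plus an open partition capped with MaxCPUsPerNode so some CPUs always remain free for that account:

    # slurm.conf (sketch); also requires a matching gres.conf entry for the GPU
    NodeName=gpu01 CPUs=32 Gres=gpu:1

    # Only the privileged account may submit here (GPU plus the reserved CPUs)
    PartitionName=gpu_reserved Nodes=gpu01 AllowAccounts=specialacct
    # Everyone else: can never consume more than 24 of the node's 32 CPUs
    PartitionName=open         Nodes=gpu01 MaxCPUsPerNode=24

Note that MaxCPUsPerNode only reserves CPUs; by itself it does not stop open-partition jobs from requesting the GPU, so a QOS or association limit on the gres/gpu TRES is usually needed on top.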