I'm a fan of the suspend mode myself, but that depends on users not
asking for all the RAM by default. If you can educate the users, then this
works really well: the low-priority job stays in RAM in suspended mode
while the high-priority job completes, and then the low-priority job
continues.
Along those lines, there is the slurm.conf setting _JobRequeue_, which
controls the default for whether jobs may be re-queued.
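As a sketch of what that looks like (the comments summarize the documented behavior; the value shown is the default):

```
# slurm.conf
# JobRequeue=1 (the default) permits batch jobs to be requeued, e.g.
# after preemption or node failure; JobRequeue=0 disables requeueing
# by default. Either default can be overridden per job at submit time
# with sbatch --requeue / --no-requeue.
JobRequeue=1
```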
- Michael
On Fri, Mar 1, 2019 at 7:07 AM Thomas M. Payerle wrote:
My understanding is that with PreemptMode=requeue, the running scavenger
job processes on the node will be killed, but the job will be placed back
in the queue (assuming the job's specific parameters allow this). A job can
have a --no-requeue flag set, in which case I assume it behaves the same as
with cancel.
I have always assumed that cancel just kills the job, whereas requeue will
cancel it and then start it again from the beginning. I know that requeue
does this; I have never tried cancel.
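For reference, the per-job requeue behavior discussed above maps to standard sbatch and scontrol invocations (illustrative commands; the script name and job ID are placeholders, and these are not tested against a live cluster):

```
# Submit a job that may be requeued after preemption (the default
# when JobRequeue=1 in slurm.conf):
sbatch --requeue job.sh

# Submit a job that must never be requeued; under PreemptMode=requeue
# such a job is simply killed when preempted:
sbatch --no-requeue job.sh

# Manually place a job back in the queue by job ID:
scontrol requeue 12345
```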
We're one of the many Slurm sites which run the slurmdbd database daemon
on the same server as the slurmctld daemon. This works without problems
at our site given our modest load; however, SchedMD recommends running
the daemons on separate servers.
Contemplating how to upgrade our cluster from
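For context, separating the two daemons mainly means pointing slurmctld at the remote slurmdbd host. A minimal sketch, assuming a MySQL-backed slurmdbd; the hostname is a placeholder:

```
# slurm.conf (read by slurmctld)
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=dbd-server.example.org   # placeholder hostname

# slurmdbd.conf (on the database server)
DbdHost=dbd-server.example.org                 # placeholder hostname
StorageType=accounting_storage/mysql
```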
Hello,
Following up on implementing preemption in Slurm. Thank you again for all
the advice. After a short break I've been able to run some basic
experiments. Initially, I have kept things very simple and made the
following changes in my slurm.conf...
# Preemption settings
PreemptType=preempt/partition_prio
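A minimal partition-based preemption setup along these lines might look like the following sketch (partition names, node list, and PriorityTier values are illustrative, not from the original message):

```
# Preemption settings
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE      # cluster-wide default; SUSPEND,GANG is another option

# Jobs in a higher-PriorityTier partition can preempt jobs in a lower one
PartitionName=high      Nodes=node[01-10] PriorityTier=10 Default=NO
PartitionName=scavenger Nodes=node[01-10] PriorityTier=1  Default=YES
```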