It is likely that your job still does not have enough priority to preempt
the scavenger job. Have a look at the output of `sprio` to see the priority
of those jobs and what factors are in play. It may be necessary to
increase the partition priority or adjust some of the job priority factors
to get
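For reference, a quick way to see the per-factor breakdown for a specific job,
and the configured weight of each factor, is something like the following (the
job ID is just a placeholder):

    sprio -j 12345 -l    # long format: per-factor priority contributions for job 12345
    sprio -w             # show the configured weight of each priority factor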
Hello,
Thank you for reminding me about the sbatch "--requeue" option. When I
submit test jobs using this option, the preemption and subsequent restart of
a job work as expected. I've also played around with "PreemptMode=suspend",
and that also works; however, I suspect we won't use that on these
"d
I'm a fan of the suspend mode myself, but that is dependent on users not
asking for all the RAM by default. If you can educate the users then this
works really well, as the low priority job stays in RAM in suspended mode
while the high priority job completes, and then the low priority job
continues f
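A minimal slurm.conf sketch of that point, assuming memory limits are actually
enforced on the nodes; the value here is only illustrative:

    # keep the default memory request modest so a suspended low-priority job
    # and the preempting high-priority job can coexist in RAM
    DefMemPerCPU=2048        # MB per allocated CPU unless the user requests more
    # SUSPEND requires gang scheduling
    PreemptMode=SUSPEND,GANG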
Along those lines, there is the slurm.conf setting _JobRequeue_, which
controls the default behavior for jobs' ability to be requeued.
- Michael
On Fri, Mar 1, 2019 at 7:07 AM Thomas M. Payerle wrote:
> My understanding is that with PreemptMode=requeue, the running scavenger
> job process
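The JobRequeue setting mentioned above is the cluster-wide default; a minimal
sketch:

    # slurm.conf: default requeue-ability of batch jobs
    JobRequeue=1   # jobs may be requeued unless submitted with --no-requeue
    # JobRequeue=0 flips the default; users can then opt in per job with --requeue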
My understanding is that with PreemptMode=requeue, the running scavenger
job processes on the node will be killed, but the job will be placed back
in the queue (assuming the job's specific parameters allow this). A job can
have a --no-requeue flag set, in which case I assume it behaves the same as
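To check how a particular job is flagged, the Requeue field in scontrol output
is useful; the job ID and script name below are placeholders:

    scontrol show job 12345 | grep -i requeue   # look for Requeue=1 (or 0)
    sbatch --no-requeue my_job.sh               # opt a single job out of requeueing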
I have always assumed that cancel just kills the job, whereas requeue will
cancel it and then start it again from the beginning. I know that requeue
does this; I never tried cancel.
I'm a fan of the suspend mode myself but that is dependent on users not
asking for all the ram by default. If you can educate the
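For context, a hedged summary of the PreemptMode values being compared, plus a
per-partition example (the partition and node names are placeholders):

    #   CANCEL  - the preempted job is simply killed
    #   REQUEUE - the preempted job is killed, then placed back in the queue (if requeue-able)
    #   SUSPEND - the preempted job is suspended in place and resumed later (requires GANG)
    PartitionName=scavenger Nodes=node[01-04] PriorityTier=1 PreemptMode=REQUEUE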
Hello,
Following up on implementing preemption in Slurm. Thank you again for all
the advice. After a short break I've been able to run some basic
experiments. Initially, I have kept things very simple and made the
following changes in my slurm.conf...
# Preemption settings
PreemptType=preempt/part
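The quoted excerpt is cut off; a minimal sketch of what such a block typically
looks like (the exact PreemptType value and the choice of REQUEUE here are
assumptions based on the rest of the thread):

    # Preemption settings
    PreemptType=preempt/partition_prio   # preempt based on partition PriorityTier
    PreemptMode=REQUEUE                  # preempted jobs are killed and requeued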
I just set this up a couple of weeks ago myself. Creating two partitions
is definitely the way to go. I created one partition, "general" for
normal, general-access jobs, and another, "interruptible" for
general-access jobs that can be interrupted, and then set PriorityTier
accordingly in my slu
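A sketch of that two-partition layout; the node ranges and tier values are
placeholders:

    PartitionName=general       Nodes=node[01-16] PriorityTier=10 Default=YES PreemptMode=OFF
    PartitionName=interruptible Nodes=node[01-16] PriorityTier=1  PreemptMode=REQUEUE

With PreemptType=preempt/partition_prio, jobs in the higher-PriorityTier
partition can preempt jobs from the lower one on the shared nodes.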
Hi Marcus,
sure, using PriorityTier is fine. And my point wasn't so much about
preemption but exactly about using just one partition and no
preemption instead of two partitions, which is what David was asking
for, isn't it? But actually, I forgot that you can do it in one partition
too by using pr
Hi Andreas,
doesn't it suffice to use priority tier partitions? You don't need to
use preemption at all, do you?
Best
Marcus
On 2/18/19 8:27 AM, Henkel, Andreas wrote:
Hi David,
I think there is another option if you don’t want to use preemption.
If the max runlimit is small (several hou
Hi David,
I think there is another option if you don’t want to use preemption. If the max
runlimit is small (several hours for example) working without preemption may be
acceptable.
Assign a qos with a priority boost to the owners of the node. Then whenever
they submit jobs to the partition the
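A sketch of that QOS approach, assuming the accounting database (slurmdbd) is
in use; the QOS, user, partition, and priority values are placeholders:

    sacctmgr add qos owner_boost
    sacctmgr modify qos where name=owner_boost set Priority=10000
    sacctmgr modify user where name=alice set qos+=owner_boost
    # owners then submit with:
    sbatch --qos=owner_boost -p owned job.sh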
Hi Paul, Marcus,
Thank you for your replies. Using partition priority all makes sense. I was
thinking of doing something similar with a set of nodes purchased by
another group. That is, having a private high priority partition and a
lower priority "scavenger" partition for the public. In this case
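A sketch of that owner/scavenger arrangement, restricting the high-tier
partition to the owning group; the group and node names are placeholders:

    PartitionName=owners    Nodes=bignode[1-4] PriorityTier=100 AllowGroups=owner_grp PreemptMode=OFF
    PartitionName=scavenger Nodes=bignode[1-4] PriorityTier=1   PreemptMode=REQUEUE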
Yup, PriorityTier is what we use to do exactly that here. That said,
unless you turn on preemption, jobs may still pend if there is no space.
We run with REQUEUE on, which has worked well.
-Paul Edmon-
On 2/15/19 7:19 AM, Marcus Wagner wrote:
Hi David,
as far as I know, you can use the Prio
Hi David,
as far as I know, you can use the PriorityTier (partition parameter) to
achieve this. According to the manpages (if I remember right) jobs from
higher priority tier partitions have precedence over jobs from lower
priority tier partitions, without taking the normal fairshare priority
Hello.
We have a small set of compute nodes owned by a group. The group has agreed
that the rest of the HPC community can use these nodes providing that they (the
owners) can always have priority access to the nodes. The four nodes are well
provisioned (1 TByte memory each plus 2 GRID K2 graph