Re: [slurm-users] Priority access for a group of users

2019-03-06 Thread Michael Gutteridge
It is likely that your job still does not have enough priority to preempt the scavenge job. Have a look at the output of `sprio` to see the priority of those jobs and what factors are in play. It may be necessary to increase the partition priority or adjust some of the job priority factors to get

Re: [slurm-users] Priority access for a group of users

2019-03-04 Thread david baker
Hello, Thank you for reminding me about the sbatch "--requeue" option. When I submit test jobs using this option the preemption and subsequent restart of a job works as expected. I've also played around with "preemptmode=suspend" and that also works, however I suspect we won't use that on these "d

Re: [slurm-users] Priority access for a group of users

2019-03-01 Thread Mark Hahn
I'm a fan of the suspend mode myself but that is dependent on users not asking for all the ram by default. If you can educate the users then this works really well as the low priority job stays in ram in suspended mode while the high priority job completes and then the low priority job continues f

Re: [slurm-users] Priority access for a group of users

2019-03-01 Thread Michael Gutteridge
Along those lines, there is the slurm.conf setting for _JobRequeue_ which controls the default behavior for jobs' ability to be re-queued. - Michael On Fri, Mar 1, 2019 at 7:07 AM Thomas M. Payerle wrote: > My understanding is that with PreemptMode=requeue, the running scavenger > job process
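
For reference, a minimal sketch of the setting mentioned above. The value shown is illustrative and is not taken from any poster's configuration:

```
# slurm.conf (illustrative fragment)
# JobRequeue sets the default for whether batch jobs may be requeued.
#   1 = jobs may be requeued unless the user disables it (sbatch --no-requeue)
#   0 = jobs are not requeued unless the user enables it (sbatch --requeue)
JobRequeue=1
```

Individual jobs can still override this default at submission time with the sbatch options noted in the comments.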

Re: [slurm-users] Priority access for a group of users

2019-03-01 Thread Thomas M. Payerle
My understanding is that with PreemptMode=requeue, the running scavenger job processes on the node will be killed, but the job will be placed back in the queue (assuming the job's specific parameters allow this). A job can have a --no-requeue flag set, in which case I assume it behaves the same as

Re: [slurm-users] Priority access for a group of users

2019-03-01 Thread Antony Cleave
I have always assumed that cancel just kills the job whereas requeue will cancel and then start from the beginning. I know that requeue does this. I never tried cancel. I'm a fan of the suspend mode myself but that is dependent on users not asking for all the ram by default. If you can educate the

Re: [slurm-users] Priority access for a group of users

2019-03-01 Thread david baker
Hello, Following up on implementing preemption in Slurm. Thank you again for all the advice. After a short break I've been able to run some basic experiments. Initially, I have kept things very simple and made the following changes in my slurm.conf... # Preemption settings PreemptType=preempt/part

Re: [slurm-users] Priority access for a group of users

2019-02-19 Thread Prentice Bisbal
I just set this up a couple of weeks ago myself. Creating two partitions is definitely the way to go. I created one partition, "general" for normal, general-access jobs, and another, "interruptible" for general-access jobs that can be interrupted, and then set PriorityTier accordingly in my slu
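
A two-partition layout of the kind described above might look roughly like the following. This is only a sketch: the partition names come from the message, but the node list and the PriorityTier values are hypothetical placeholders, not the poster's actual settings:

```
# slurm.conf (illustrative fragment; node names and tier values are assumed)
# Both partitions cover the same nodes. Jobs in the partition with the
# higher PriorityTier are considered for scheduling first.
PartitionName=general       Nodes=node[01-04] PriorityTier=10 Default=YES State=UP
PartitionName=interruptible Nodes=node[01-04] PriorityTier=1  State=UP
```

On its own, PriorityTier only affects scheduling order. Jobs already running in the lower tier are not interrupted unless preemption is also configured.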

Re: [slurm-users] Priority access for a group of users

2019-02-18 Thread Henkel
Hi Marcus, sure, using PriorityTier is fine. My point wasn't so much about preemption as about using just one partition and no preemption instead of two partitions, which is what David was asking for, isn't it? But actually, I forgot that you can do it in one partition too by using pr

Re: [slurm-users] Priority access for a group of users

2019-02-18 Thread Marcus Wagner
Hi Andreas, doesn't it suffice to use priority tier partitions? You don't need to use preemption at all, do you? Best Marcus On 2/18/19 8:27 AM, Henkel, Andreas wrote: Hi David, I think there is another option if you don’t want to use preemption. If the max runlimit is small (several hou

Re: [slurm-users] Priority access for a group of users

2019-02-17 Thread Henkel, Andreas
Hi David, I think there is another option if you don’t want to use preemption. If the max runlimit is small (several hours for example) working without preemption may be acceptable. Assign a qos with a priority boost to the owners of the node. Then whenever they submit jobs to the partition the

Re: [slurm-users] Priority access for a group of users

2019-02-15 Thread david baker
Hi Paul, Marcus, Thank you for your replies. Using partition priority all makes sense. I was thinking of doing something similar with a set of nodes purchased by another group. That is, having a private high priority partition and a lower priority "scavenger" partition for the public. In this case

Re: [slurm-users] Priority access for a group of users

2019-02-15 Thread Paul Edmon
Yup, PriorityTier is what we use to do exactly that here. That said, unless you turn on preemption, jobs may still pend if there is no space. We run with REQUEUE on, which has worked well. -Paul Edmon- On 2/15/19 7:19 AM, Marcus Wagner wrote: Hi David, as far as I know, you can use the Prio
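
As a rough illustration of turning on partition-based preemption with requeue, the cluster-wide settings could look like this. It is a sketch under the assumption that preemption is driven by partition priority, and it is not a copy of the poster's site configuration:

```
# slurm.conf (illustrative fragment)
# Jobs in a higher PriorityTier partition may preempt jobs in a lower one.
PreemptType=preempt/partition_prio
# Preempted jobs are requeued, provided the job itself allows requeueing.
PreemptMode=REQUEUE
```

With settings like these, a preempted job that was submitted with --no-requeue cannot be put back in the queue, so it is worth checking the JobRequeue default alongside them.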

Re: [slurm-users] Priority access for a group of users

2019-02-15 Thread Marcus Wagner
Hi David, as far as I know, you can use the PriorityTier (partition parameter) to achieve this. According to the manpages (if I remember right) jobs from higher priority tier partitions have precedence over jobs from lower priority tier partitions, without taking the normal fairshare priority

[slurm-users] Priority access for a group of users

2019-02-15 Thread David Baker
Hello. We have a small set of compute nodes owned by a group. The group has agreed that the rest of the HPC community can use these nodes providing that they (the owners) can always have priority access to the nodes. The four nodes are well provisioned (1 TByte memory each plus 2 GRID K2 graph