It's the association (account) limit. The problem was that lower-priority jobs were backfilling (even with the builtin scheduler) around this larger job, preventing it from running.
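(For anyone searching the archives later: the limit itself lives in the association records, so something along these lines should show it -- the user filter here is just a placeholder:

    sacctmgr show assoc format=Cluster,Account,User,GrpTRES where user=$USER

As I understand it, the cpu= value in GrpTRES is what produces the AssocGrpCpuLimit pending reason that squeue reports.)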
I have found what looks like the solution: switch to the builtin scheduler and add "assoc_limit_stop" to "SchedulerParameters". From slurm.conf(5):

    assoc_limit_stop
        If set and a job cannot start due to association limits, then do
        not attempt to initiate any lower priority jobs in that partition.
        Setting this can decrease system throughput and utilization, but
        avoid potentially starving larger jobs by preventing them from
        launching indefinitely.

I've made those changes (a sketch of the relevant slurm.conf lines is at the bottom of this message), and now only the lower-priority jobs wait for the larger, higher-priority job. I must have looked past that section of the manpage a dozen times 8-/ before making the connection.

It doesn't seem to fix this when I use the backfill scheduler, but that may be due to the runtimes of the various jobs. For us, switching to builtin actually makes more sense for our cloud cluster setup, so there's no problem making that change.

Thanks to all for your time looking at the problem.

Best

Michael

On Thu, Feb 28, 2019 at 7:54 AM Chris Samuel <ch...@csamuel.org> wrote:
> On 28/2/19 7:29 am, Michael Gutteridge wrote:
>
> >    2221670 largenode sleeper.       me PD       N/A      1
> > (null)               (AssocGrpCpuLimit)
>
> That says the job exceeds some policy limit you have set and so is not
> permitted to start; looks like you've got a limit on the number of cores
> that an association has in the hierarchy, either at or above that level,
> that this would exceed.
>
> You'll probably need to go poking around with sacctmgr to see what that
> limit might be.
>
> All the best,
> Chris
> --
>  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
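For reference, a minimal sketch of the scheduler change in slurm.conf. Any other SchedulerParameters a site already uses would need to stay in the comma-separated list, and (as I understand it) slurmctld needs a restart for a SchedulerType change to take effect:

    SchedulerType=sched/builtin
    SchedulerParameters=assoc_limit_stop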