We've been working on tuning our backfill scheduler here. Below is a presentation on the topic that some of you might have seen at a previous SLUG. HTH!

https://slurm.schedmd.com/SUG14/sched_tutorial.pdf

David

On Wed, Oct 2, 2019 at 1:37 PM Mark Hahn <h...@mcmaster.ca> wrote:

> >(most likely in the next year). My reaction is that Slurm very rarely
> >provides an estimated start time for a job. I understand that this is not
> >possible for jobs on hold and dependent jobs.
>
> it's also not possible if both running and queued jobs
> lack definite termination times; do yours?
>
> my understanding is the following:
> the main scheduler does not perform forward planning.
> that is, it is opportunistic. it walks the list of priority-sorted
> pending jobs, starting any which can run on currently free
> (or preemptable) resources.
>
> the backfill scheduler is a secondary, asynchronous loop that tries hard
> not to interfere with the main scheduler (severely throttles itself)
> and tries to place start times for pending jobs.
>
> the main issue with forward scheduling is that if high-prio jobs become
> runnable (submitted, off hold, dependency-satisfied), then most of the
> (tentative) start times probably need to be removed.
>
> a quick look at plugins/sched/backfill/backfill.c indicates that things
> are /complicated/ ;)
>
> we (ComputeCanada) don't see a lot of forward start times either.
>
> I also would welcome discussion of how to tune the backfill scheduler!
> I suspect that in order to work well, it needs a particular distribution
> of job priorities.
>
> regards, mark hahn.
>
>

--
David Rhey
---------------
Advanced Research Computing - Technology Services
University of Michigan
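
P.S. For anyone skimming the thread, the opportunistic main-scheduler pass Mark describes in the quoted message could be sketched roughly as below. This is only a toy illustration of that description, not Slurm's actual code: the Job class, its fields, and main_scheduler_pass are hypothetical names, resources are reduced to a single node count, and preemption is ignored.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical, simplified job record (not a Slurm data structure).
    name: str
    priority: int  # higher value is considered first
    nodes: int     # number of nodes requested


def main_scheduler_pass(pending, free_nodes):
    """Walk pending jobs in priority order and start any that fit right now.

    No forward planning happens here: a job that does not fit is skipped
    and gets no estimated start time.
    """
    started = []
    for job in sorted(pending, key=lambda j: j.priority, reverse=True):
        if job.nodes <= free_nodes:
            free_nodes -= job.nodes
            started.append(job.name)
    return started, free_nodes


if __name__ == "__main__":
    queue = [Job("a", 10, 4), Job("b", 5, 8), Job("c", 1, 2)]
    print(main_scheduler_pass(queue, free_nodes=6))
```

In this toy run, job "b" is skipped because it does not fit, and nothing records when it might start. As I read Mark's description, placing those tentative start times is the part left to the backfill loop.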