Greetings,

We are new to Slurm and are trying to understand why high-mem jobs stay
stuck in the Pending state indefinitely. Smaller-memory jobs in the queue
continue to pass the high-mem jobs, even when we raise the priority of a
pending high-mem job substantially. We have been reading over the
backfill scheduling page, and we think we need to require that users
specify a --time parameter on their jobs so that backfill works properly.
None of our users specify --time because we have never required it. Is
that what we need to require in order to fix this situation? From the
backfill page: "Backfill scheduling is difficult
without reasonable time limit estimates for jobs, but some configuration
parameters that can help" and it goes on to list some config params that we
have not set (DefaultTime, MaxTime, OverTimeLimit). We also see language
such as, “Since the expected start time of pending jobs depends upon the
expected completion time of running jobs, reasonably accurate time limits
are important for backfill scheduling to work well.” So we suspect that we
can achieve proper backfill scheduling by using a job submit plugin to
require that all users supply a "--time" parameter. Would that be a fair statement?
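
To make our understanding concrete, here is a toy sketch of how we read the quoted passages. It is our own simplification and not Slurm's actual algorithm: the function names, the memory-only resource model, and the sample numbers are all assumptions made for illustration. The idea is that a pending job's expected start is derived from the expected end times of running jobs, and a lower-priority job may start early only if it would finish by then.

```python
import math

def expected_start(needed_mem, free_mem, running):
    """Earliest time (hours from now) at which needed_mem becomes free.

    running is a list of (mem, expected_end) pairs, where expected_end is
    derived from the job's time limit; math.inf models a job with no limit.
    Hypothetical toy model: memory is the only resource considered.
    """
    if free_mem >= needed_mem:
        return 0.0
    # Release memory in order of expected completion time.
    for mem, end in sorted(running, key=lambda job: job[1]):
        free_mem += mem
        if free_mem >= needed_mem:
            return end
    return math.inf

def can_backfill(candidate_limit, reserved_start):
    """A lower-priority job may start now only if its time limit lets it
    finish no later than the higher-priority job's expected start."""
    return candidate_limit <= reserved_start

# With time limits: the high-mem job can be planned to start at hour 4.
with_limits = expected_start(128, 0, [(64, 2.0), (64, 4.0)])

# Without time limits: the expected start is never reached, so any small
# job, even one with no limit itself, is allowed to go first.
without_limits = expected_start(128, 0, [(64, math.inf), (64, math.inf)])
```

In this toy model, a one-hour small job can backfill ahead of the high-mem job when limits are known and a six-hour one cannot, whereas with no limits at all nothing ever holds the small jobs back, which resembles what we are observing.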



Thank you in advance!

-Mike Schor