It would certainly help. Setting reasonable default time limits is a
good idea in general. For instance, we set 10 minutes as our default, and
anyone who needs longer has to request it explicitly (up to the MaxTime
for the partition).
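A sketch of what that could look like in slurm.conf, assuming a single partition. The partition name, node list, and MaxTime value here are made up for illustration; only the 10-minute default reflects what is described above.

```
# slurm.conf fragment: partition name, nodes, and MaxTime are hypothetical
# DefaultTime applies when a job is submitted without --time
# MaxTime caps what a user may request on this partition
PartitionName=general Nodes=node[01-10] DefaultTime=00:10:00 MaxTime=3-00:00:00 State=UP
```

With something like this in place, a job submitted without --time gets a 10-minute limit, and a longer run needs an explicit request such as --time=1-00:00:00.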
More to the point, what Reason does the scheduler give for why the job
is pending? If you run squeue or scontrol show job, it should list the
reason the job is pending. If it is Resources, then the scheduler is
waiting for sufficient resources to free up to schedule the job. If it
is Priority, then the job is pending because other jobs are ahead of it.
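One way to check this, as a minimal sketch: the snippet below lists pending jobs with their reason and tallies how many jobs share each reason. The helper name count_reasons is made up for this example, and it assumes the first word of the reason is enough to tell the cases apart. The squeue call only runs where the Slurm client tools are installed.

```shell
# count_reasons (hypothetical helper): reads "JOBID REASON" lines on stdin
# and prints one "REASON COUNT" line per distinct reason, sorted by name.
count_reasons() {
    awk '{ n[$2]++ } END { for (r in n) print r, n[r] }' | sort
}

# Only query the scheduler if squeue is available on this machine.
if command -v squeue >/dev/null 2>&1; then
    # %i = job ID, %r = reason the job is pending
    squeue --noheader --states=PENDING --format="%i %r" | count_reasons
fi

# For a single job, the Reason= field appears in the output of:
#   scontrol show job <jobid>
```

If the high-mem jobs show Priority rather than Resources, the place to look is what is ahead of them in the queue, not the time limits.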
-Paul Edmon-
On 5/6/2025 11:05 AM, Mike via slurm-users wrote:
Greetings,
We are new to Slurm and we are trying to better understand why we’re
seeing high-mem jobs stuck in Pending state indefinitely. Smaller
(mem) jobs in the queue will continue to pass by the high mem jobs
even when we bump priority on a pending high-mem job way up. We have
been reading over the backfill scheduling page and what we think we're
seeing is that we need to require that users specify a --time
parameter on their jobs so that Backfill works properly. None of our
users specify a --time param because we have never required it. Is
that what we need to require in order to fix this situation? From the
backfill page: "Backfill scheduling is difficult without reasonable
time limit estimates for jobs, but some configuration parameters that
can help" and it goes on to list some config params that we have not
set (DefaultTime, MaxTime, OverTimeLimit). We also see language such
as, “Since the expected start time of pending jobs depends upon the
expected completion time of running jobs, reasonably accurate time
limits are important for backfill scheduling to work well.” So we
suspect that we can achieve proper backfill scheduling by requiring
that all users supply a "--time" parameter via a job submit plugin.
Would that be a fair statement?
Thank you in advance!
-Mike Schor