Thank you so much for the prompt response! This makes a lot of sense. We hadn't seen this explicitly stated in the docs, but it's what we gleaned from them.
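For anyone reading this thread later, here is a rough sketch of how we now picture the rule. It is a toy model in Python, not Slurm's actual backfill algorithm: the `Running` class, the function names, and the memory-only accounting are all made up for illustration. In particular, the branch that lets a small job through when no start time can be predicted is only our assumption about why we see small jobs passing the high-mem one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Running:
    """A running job in the toy model (hypothetical, not a Slurm structure)."""
    mem_gb: int
    minutes_left: Optional[int]  # None means the job has no time limit


def expected_start(running: List[Running], free_gb: int, need_gb: int) -> Optional[int]:
    """Minutes until need_gb of memory is free, or None if that cannot be predicted."""
    if free_gb >= need_gb:
        return 0
    # Only jobs with a time limit have a known completion time.
    bounded = sorted(
        (job for job in running if job.minutes_left is not None),
        key=lambda job: job.minutes_left,
    )
    for job in bounded:
        free_gb += job.mem_gb  # memory released when this job ends
        if free_gb >= need_gb:
            return job.minutes_left
    return None  # jobs without limits never release memory in this model


def may_backfill(small_limit: Optional[int], small_mem_gb: int,
                 running: List[Running], free_gb: int, big_need_gb: int) -> bool:
    """Toy rule: a small job may start only if it cannot delay the big job."""
    if small_mem_gb > free_gb:
        return False  # does not fit right now
    start = expected_start(running, free_gb, big_need_gb)
    if start is None:
        # ASSUMPTION: with no predictable start for the big job there is
        # nothing to protect, so the small job is let through.
        return True
    if small_limit is None:
        return False  # unknown runtime could delay the big job
    return small_limit <= start


# No time limits anywhere: the big job's start is unpredictable.
no_limits = [Running(64, None), Running(64, None)]
# Same jobs with time limits: the big job's start can be planned.
with_limits = [Running(64, 30), Running(64, 90)]

print(expected_start(no_limits, 32, 128))
print(expected_start(with_limits, 32, 128))
print(may_backfill(600, 16, with_limits, 32, 128))
```

In this sketch, once running jobs carry time limits the high-mem job gets a predictable start time, and a small job is only let through if its own limit fits before that point.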
On Tue, May 6, 2025 at 11:14 AM Paul Edmon via slurm-users <slurm-users@lists.schedmd.com> wrote:

> Certainly it would help. Setting reasonable defaults for time is a good idea
> just in general. For instance we set 10 minutes as our default and anything
> longer people have to explicitly request (up to the MaxTime for the
> partition).
>
> More to the point, what Reason does the scheduler give for why the job is
> pending? If you do squeue or scontrol show job it should list the reason
> why it's pending. If it's Resources, then the scheduler is waiting for
> sufficient resources to free up to schedule it. If it is Priority then the
> job is pending due to other jobs ahead of it.
>
> -Paul Edmon-
>
> On 5/6/2025 11:05 AM, Mike via slurm-users wrote:
>
> Greetings,
>
> We are new to Slurm and we are trying to better understand why we’re
> seeing high-mem jobs stuck in Pending state indefinitely. Smaller (mem)
> jobs in the queue will continue to pass by the high-mem jobs even when we
> bump priority on a pending high-mem job way up. We have been reading over
> the backfill scheduling page and what we think we're seeing is that we need
> to require that users specify a --time parameter on their jobs so that
> Backfill works properly. None of our users specify a --time param because
> we have never required it. Is that what we need to require in order to fix
> this situation? From the backfill page: "Backfill scheduling is difficult
> without reasonable time limit estimates for jobs, but some configuration
> parameters that can help" and it goes on to list some config params that we
> have not set (DefaultTime, MaxTime, OverTimeLimit). We also see language
> such as, “Since the expected start time of pending jobs depends upon the
> expected completion time of running jobs, reasonably accurate time limits
> are important for backfill scheduling to work well.” So we suspect that we
> can achieve proper backfill scheduling by requiring that all users supply a
> "--time" parameter via a job submit plugin. Would that be a fair statement?
>
> Thank you in advance!
>
> -Mike Schor
>
> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com