Hello Slurm Users,

I am experimenting with the new --prefer soft constraint option in 22.05. The option behaves as described, but it is somewhat inefficient when many jobs with different --prefer options are submitted.

Here is the scenario:

1. Submit an array of 100 tasks preferring feature A; each task requires 1 CPU.
2. Submit an array of 100 tasks preferring feature B; each task requires 1 CPU.

There are two nodes in the cluster:

- node A, 20 CPUs, has feature A
- node B, 20 CPUs, has feature B

What I observe is:

1. 20 jobs from the *first* array are launched on node *A* (--prefer feature A)
2. 20 jobs from the *first* array are launched on node *B* (--prefer feature A)

The second step seems sub-optimal, since 20 jobs from the second array, which prefer feature B, could have been launched on node B instead.

Ideally, jobs whose soft constraint cannot be met would get a lower priority, so that jobs whose soft constraint can be met have a chance to be evaluated and launched first. Is this possible?

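To make the scenario concrete, here is a toy Python model of the two behaviours. It is only a sketch: it does not reproduce Slurm's actual scheduling code, and every name in it (`place`, `greedy`, `prefer_first`, `summarize`, `nodeA`, `nodeB`) is made up for illustration. It assumes jobs are considered strictly in submission order and that each job needs exactly 1 CPU.

```python
def make_nodes():
    # Two hypothetical nodes, 20 free CPUs each, one feature per node.
    return {
        "nodeA": {"feature": "A", "free": 20},
        "nodeB": {"feature": "B", "free": 20},
    }


def make_queue():
    # 100 tasks preferring feature A, followed by 100 preferring feature B.
    first = [(f"arrayA_{i}", "A") for i in range(100)]
    second = [(f"arrayB_{i}", "B") for i in range(100)]
    return first + second


def place(pref, nodes, allow_fallback):
    """Take one CPU on a node with the preferred feature; optionally
    fall back to any node with a free CPU. Returns the node name or None."""
    for name, node in nodes.items():
        if node["free"] > 0 and node["feature"] == pref:
            node["free"] -= 1
            return name
    if allow_fallback:
        for name, node in nodes.items():
            if node["free"] > 0:
                node["free"] -= 1
                return name
    return None


def greedy(queue, nodes):
    # Models what I observe: each job falls back immediately
    # when its preferred feature has no free CPU.
    placed = {}
    for job, pref in queue:
        node = place(pref, nodes, allow_fallback=True)
        if node is not None:
            placed[job] = node
    return placed


def prefer_first(queue, nodes):
    # Models what I would like: first launch only the jobs whose
    # preference can be met, then let the remaining jobs fall back.
    placed = {}
    for job, pref in queue:
        node = place(pref, nodes, allow_fallback=False)
        if node is not None:
            placed[job] = node
    for job, pref in queue:
        if job not in placed:
            node = place(pref, nodes, allow_fallback=True)
            if node is not None:
                placed[job] = node
    return placed


def summarize(placed, queue):
    # Count launched jobs per (preferred feature, node) pair.
    prefs = dict(queue)
    counts = {}
    for job, node in placed.items():
        key = (prefs[job], node)
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    queue = make_queue()
    print("greedy:      ", summarize(greedy(queue, make_nodes()), queue))
    print("prefer first:", summarize(prefer_first(queue, make_nodes()), queue))
```

In this toy model the greedy pass puts 20 tasks of the first array on each node, which matches what I see, while the two-pass variant puts 20 tasks preferring A on node A and 20 tasks preferring B on node B.
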
If this is not already possible, I would like to try some ideas by modifying the source code. From the man page, I see:

*--prefer*=<*list*> <https://slurm.schedmd.com/sbatch.html#OPT_prefer>
Nodes can have *features* assigned to them by the Slurm administrator. Users can specify which of these *features* are desired but not required by their job using the prefer option. This option operates independently from *--constraints* and will override whatever is set there if possible. When scheduling the features in *--prefer* are tried first if a node set isn't available with those features then *--constraints* is attempted. See *--constraints* for more information, this option behaves the same way.

If someone can point me to where this fallback logic (i.e. from --prefer to --constraint, which could be empty) is implemented in the source code (somewhere in src/slurmctld?), I can try to hack that part, recompile Slurm, and report my findings.

Thank you,
Manchang
