Hi,

We have a strange problem with backfilling. There is a
large partition "cpu" and an overlapping partition "largemem", whose nodes are a subset of the "cpu" nodes.
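
To make the layout concrete, here is a rough sketch of how the two partitions relate in slurm.conf. The node names, node counts, and hardware values are placeholders for illustration only, not our actual configuration:

```
# Hypothetical node definitions (names and sizes are made up)
NodeName=cn[001-100]  CPUs=64 RealMemory=256000
NodeName=fat[001-010] CPUs=64 RealMemory=1024000

# "cpu" spans all nodes; "largemem" is the subset of big-memory nodes
PartitionName=cpu      Nodes=cn[001-100],fat[001-010] Default=YES State=UP
PartitionName=largemem Nodes=fat[001-010] State=UP
```

In this sketch, cn[001-100] are the non-overlapping "cpu" nodes and fat[001-010] are the nodes shared by both partitions.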

Now, user A submits low-priority jobs to "cpu", and user B submits high-priority jobs to "largemem". If there are queued jobs in "largemem" (draining nodes there), slurmctld never backfills "cpu". In the extreme case, the non-overlapping "cpu" nodes sit empty until all the higher-priority jobs are running in "largemem".

Any hints or workarounds? Backfill works quite well if all the jobs are submitted to the "cpu" partition. User A typically has smaller and shorter jobs, which are good candidates for backfilling.

We use these settings with Slurm:
PriorityType=priority/multifactor
SchedulerType=sched/backfill
SelectType=select/cons_tres
SelectTypeParameters=CR_CORE_MEMORY,CR_CORE_DEFAULT_DIST_BLOCK
SchedulerParameters=bf_max_job_test=2000,bf_window=1440,default_queue_depth=1000,bf_continue

Best regards,
Andrej

--
_____________________________________________________________
   prof. dr. Andrej Filipcic,   E-mail: andrej.filip...@ijs.si
   Department of Experimental High Energy Physics - F9
   Jozef Stefan Institute, Jamova 39, P.o.Box 3000
   SI-1001 Ljubljana, Slovenia
   Tel.: +386-1-477-3674    Fax: +386-1-425-7074
-------------------------------------------------------------

