It sounds like the jobs in your second partition are being scheduled
primarily by the backfill scheduler. I would try setting the
partition_job_depth option in SchedulerParameters, as otherwise the main
scheduling loop considers jobs only in priority order, not per partition.
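
As a rough sketch, assuming the backfill scheduler is in use, the
relevant slurm.conf lines might look like the following. The depth value
of 500 is purely illustrative and would need tuning for the site, and any
SchedulerParameters already set in the file would have to be merged into
the same comma-separated line.

```
# slurm.conf fragment (illustrative values only, not taken from the
# original poster's configuration)
SchedulerType=sched/backfill

# partition_job_depth limits how many jobs per partition the main
# scheduling loop will try to start, so a deep queue in one partition
# is less likely to keep it from reaching jobs in another.
SchedulerParameters=partition_job_depth=500

# Optional: extra backfill detail in the slurmctld log while debugging.
#DebugFlags=Backfill
```

After editing slurm.conf, `scontrol reconfigure` asks the running daemons
to reread it. To see why a given job is still pending, the reason field
of squeue (`%r` in an `-o` format string) or `scontrol show job <jobid>`
is usually the quickest place to look.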
-Paul Edmon-
On 4/29/2018 5:32 AM, Zohar Roe MLM wrote:
Hello.
I have 2 clusters in my slurm.conf:
CLUS_WORK1:
    server1
    server2
    server3
CLUS_WORK2:
    pc1
    pc2
    pc3
When I send 10,000 jobs to CLUS_WORK1, they behave as expected: most start
running while a few stay in the pending state (which is OK).
But if I then send new jobs to CLUS_WORK2, which is idle, those jobs also
go into the pending state, and it takes them about 20 minutes to start
running.
I didn't find any setting or configuration that could cause this.
Is there a log I can check to see why they are pending?
Thanks.