The Limitations section of the heterogeneous job support page mentions 
backfill.

 

https://slurm.schedmd.com/heterogeneous_jobs.html#limitations

 

Maybe there’s some information there to help?

 

Ken

 

From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of Ana 
Jokanovic
Sent: Thursday, November 29, 2018 4:28 AM
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] backfill scheduler does not work for heterogeneous jobs 
(version 17.11)

 

 

 

Hello,

 

I ran a simple test, submitting a workload of three heterogeneous jobs (see 
below) on a 5-node cluster:

 

sbatch --cpus-per-task=2 --ntasks=6 --time=15 : --cpus-per-task=2 --ntasks=6 --time=15 : --cpus-per-task=2 --ntasks=6 --time=15

sbatch --cpus-per-task=2 --ntasks=6 --time=15 : --cpus-per-task=2 --ntasks=6 --time=15 : --cpus-per-task=2 --ntasks=6 --time=15

sleep 5;

sbatch --ntasks=1 --time=2 : --ntasks=1 --time=1

 

I would expect the third job to be backfilled, but that does not happen.

Here is the job completion log:

 

JobId=2 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug 
TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317714 
EndTime=1543317774 NodeList=s19r2b09 NodeCnt=1 ProcCnt=48 

JobId=3 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug 
TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317714 
EndTime=1543317774 NodeList=s19r2b10 NodeCnt=1 ProcCnt=48 

JobId=4 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug 
TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317714 
EndTime=1543317774 NodeList=s19r2b12 NodeCnt=1 ProcCnt=48 

JobId=8 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug 
TimeLimit=00:02:00 SubmitTime=1543317699 StartTime=1543317804 
EndTime=1543317824 NodeList=s19r2b14 NodeCnt=1 ProcCnt=48 

JobId=9 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug 
TimeLimit=00:01:00 SubmitTime=1543317699 StartTime=1543317804 
EndTime=1543317824 NodeList=s19r2b16 NodeCnt=1 ProcCnt=48 

JobId=5 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug 
TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317804 
EndTime=1543317864 NodeList=s19r2b09 NodeCnt=1 ProcCnt=48 

JobId=6 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug 
TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317804 
EndTime=1543317864 NodeList=s19r2b10 NodeCnt=1 ProcCnt=48 

JobId=7 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug 
TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317804 
EndTime=1543317864 NodeList=s19r2b12 NodeCnt=1 ProcCnt=48 
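
As a rough check of the timings in the log above, the following Python sketch computes how long each component waited and ran. It is only an illustration, not part of the original test: the helper names (parse_record, wait_and_run) are hypothetical, and the records are condensed to one line each with only the fields needed here.

```python
# Records copied from the job completion log above, one per line,
# trimmed to the fields this sketch uses.
LOG = """\
JobId=2 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b09
JobId=3 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b10
JobId=4 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b12
JobId=8 SubmitTime=1543317699 StartTime=1543317804 EndTime=1543317824 NodeList=s19r2b14
JobId=9 SubmitTime=1543317699 StartTime=1543317804 EndTime=1543317824 NodeList=s19r2b16
JobId=5 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b09
JobId=6 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b10
JobId=7 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b12
"""


def parse_record(line):
    """Split a 'Key=Value Key=Value ...' record into a dict of strings."""
    return dict(field.split("=", 1) for field in line.split())


def wait_and_run(log_text):
    """Map each JobId to (seconds queued, seconds running)."""
    result = {}
    for line in log_text.strip().splitlines():
        rec = parse_record(line)
        submit = int(rec["SubmitTime"])
        start = int(rec["StartTime"])
        end = int(rec["EndTime"])
        result[int(rec["JobId"])] = (start - submit, end - start)
    return result


if __name__ == "__main__":
    for job_id, (wait, run) in sorted(wait_and_run(LOG).items()):
        print(f"JobId={job_id} waited {wait}s, ran {run}s")
```

With these records, components 8 and 9 start at the same timestamp as components 5 to 7 (1543317804), after components 2 to 4 have ended, even though their nodes (s19r2b14 and s19r2b16) do not appear in the records for components 2 to 4.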

 

Is this the expected behavior?

 

Thanks.

 

Best regards,

Ana

-- 

Ana Jokanovic, PhD
Barcelona Supercomputing Center
c/ Jordi Girona 1-3, K2M Building, 1st floor
08034 Barcelona - SPAIN
e-mail: ana...@gmail.com <mailto:ana...@gmail.com>  or ana.jokano...@bsc.es 
<mailto:ana.jokano...@bsc.es> 
tel: +34 93 4137246




 

