Hi Michael,

Yes, without the singleton it works as expected:

$ sbatch --hold fakejob.sh
Submitted batch job 26636869
$ sbatch --hold fakejob.sh
Submitted batch job 26636870
$ sbatch --hold fakejob.sh
Submitted batch job 26636871
$ scontrol update jobid=26636870 Dependency=after:26636871
$ scontrol update jobid=26636871 Dependency=after:26636869
$ scontrol release 26636869 26636870 26636871
$ squeue -u jarno
JOBID     USER   ACCOUNT       NAME     ST  TIME_LEFT  NODES  CPUS  GRES    MIN_MEM  NODELIST (REASON)
26636869  jarno  def-jarno_cp  fakejob  R   1:35       1      1     (null)  250M     cdr650 (None)
26636871  jarno  def-jarno_cp  fakejob  R   1:39       1      1     (null)  250M     cdr652 (None)
26636870  jarno  def-jarno_cp  fakejob  R   1:42       1      1     (null)  250M     cdr667 (None)

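As a sanity check on the output above, here is a minimal Python sketch that compares the observed TIME_LEFT values with the two "after" dependencies. It assumes that all three jobs have the same time limit (they come from the same fakejob.sh) and that squeue reports them at a single instant, so less time left means an earlier start. The dictionary and function names are made up for illustration and are not part of any Slurm API. All three jobs being in state R at once is consistent with "after", which only waits for the named job to start, not to finish.

```python
# TIME_LEFT values copied from the squeue output above, as "M:SS".
time_left = {26636869: "1:35", 26636871: "1:39", 26636870: "1:42"}

# Dependency=after:<id> as set with scontrol: job -> job it must start after.
after = {26636870: 26636871, 26636871: 26636869}


def seconds(mmss: str) -> int:
    """Convert an "M:SS" string to a number of seconds."""
    minutes, secs = mmss.split(":")
    return int(minutes) * 60 + int(secs)


# Assumption: identical time limits, so less time left means an earlier start.
start_order = sorted(time_left, key=lambda job: seconds(time_left[job]))

# A dependency is violated if a job started before the job it waits for,
# i.e. if it has strictly less time left than that job.
violations = [
    (job, dep)
    for job, dep in after.items()
    if seconds(time_left[job]) < seconds(time_left[dep])
]

print("inferred start order:", start_order)
print("violated dependencies:", violations)
```

Under those assumptions the inferred start order is 26636869, 26636871, 26636870, with no violated dependency.
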
Thanks,
Jarno

Jarno van der Kolk, PhD Phys.
Analyste principal en informatique scientifique | Senior Scientific Computing Specialist
Solutions TI | IT Solutions
Université d’Ottawa | University of Ottawa

________________________________
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Michael Di Domenico <mdidomeni...@gmail.com>
Sent: August 28, 2019 10:26 AM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Dependencies with singleton and after

Attention : courriel externe | external email

just curious. if you leave out the singleton, do you get the behavior as expected?

On Tue, Aug 27, 2019 at 9:42 AM Jarno van der Kolk <jvand...@uottawa.ca> wrote:
>
> Hi all,
>
> I'm still puzzled by the expected behaviour of the following:
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909273
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909274
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909275
> $ scontrol update jobid=25909273 Dependency=singleton
> $ scontrol update jobid=25909274 Dependency=singleton,after:25909275
> $ scontrol update jobid=25909275 Dependency=singleton,after:25909273
> $ scontrol release 25909273 25909274 25909275
>
> I expected these to be executed as 25909273, 25909275, 25909274. However, it
> seems that singletons are executed in order of submission so that this leads
> to a circular dependency. That is, 25909274 depends on 25909275 due to
> "after", and 25909275 depends on 25909274 due to "singleton" plus order of
> submission.
>
> From the man page for sbatch, that wasn't really clear to me:
>     singleton
>         This job can begin execution after any previously launched jobs
>         sharing the same job name and user have terminated.
>
> I'm somewhat interested in creating a patch for this, but before I can look
> into this, I'll need to know what the expected behaviour is.
> If "launched" means submitted to the queue and preserving order, then I
> should focus on the circular dependency detection.
> If "launched" means entered the running state without preserving order, then
> I should focus on the dependency resolving.
>
> Any thoughts on this?
>
> Thanks,
> Jarno
>
> Jarno van der Kolk, PhD Phys.
> Analyste principal en informatique scientifique | Senior Scientific Computing
> Specialist
> Solutions TI | IT Solutions
> Université d’Ottawa | University of Ottawa
>
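
The two readings of "launched" in the quoted message can be compared with a small dependency-graph model. The Python sketch below is only an illustration of the reasoning, not how Slurm resolves dependencies internally. It assumes that under the first reading a singleton job waits for every earlier-submitted job with the same name and user, and that under the second reading singleton adds no ordering of its own (only mutual exclusion), so the "after" edges alone decide the order. The names jobs, after, build_graph and resolve are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Job IDs from the quoted example, in submission order.
jobs = [25909273, 25909274, 25909275]

# Explicit "after" dependencies set with scontrol: job -> jobs it waits for.
after = {25909274: {25909275}, 25909275: {25909273}}


def build_graph(singleton_keeps_submission_order: bool) -> dict:
    """Return a mapping job -> set of jobs that must come first."""
    graph = {job: set(after.get(job, ())) for job in jobs}
    if singleton_keeps_submission_order:
        # Reading 1 (assumption): each singleton job also waits for every
        # job with the same name and user that was submitted before it.
        for index, job in enumerate(jobs):
            graph[job].update(jobs[:index])
    return graph


def resolve(singleton_keeps_submission_order: bool):
    """Return a valid execution order, or None if the graph has a cycle."""
    graph = build_graph(singleton_keeps_submission_order)
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None


print("submission order preserved:", resolve(True))
print("order left to 'after' only:", resolve(False))
```

In this model the first reading has no valid order, because 25909274 waits for 25909275 through "after" while 25909275 waits for 25909274 through singleton. The second reading yields the expected 25909273, 25909275, 25909274.
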