Thank you all for such interesting replies. The --dependency option is quite useful, but in practice it has some drawbacks. Firstly, all 20 jobs are *instantly queued*, which some users may interpret as an abusive use of shared resources. Even worse, if a job fails, the remaining ones stay queued forever (?): the first is tagged as "DependencyNeverSatisfied" and the rest just as "Dependency".
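For reference, the chained submission I tried looks roughly like this (job.sh stands in for my actual batch script, and I use sbatch --parsable just to capture the job ID):

#!/bin/bash
# Submit 20 jobs, each allowed to start only after the previous one
# completes successfully (afterok). All 20 enter the queue at once;
# if one fails, its successor is tagged "DependencyNeverSatisfied"
# and the later ones stay as "Dependency".
prev=""
for i in {1..20}; do
    if [ -z "$prev" ]; then
        jid=$(sbatch --parsable --time 08:10:00 job.sh)
    else
        jid=$(sbatch --parsable --time 08:10:00 --dependency=afterok:"$prev" job.sh)
    fi
    prev=$jid
done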
PS: Yarom, by queue time I meant the total run time allowed. In my case, after a job starts running it will be killed if it takes more than 10 hours of execution time. If the partition time limit were 10 days, for instance, I guess I could use a single sbatch to launch a script containing the 20 executions as srun steps (a sketch of what I mean is at the bottom of this message).

Regards,
Nigella

On Mon, Nov 25, 2019 at 15:08, Yair Yarom (<ir...@cs.huji.ac.il>) wrote:

> Hi,
>
> I'm not sure what a queue time limit of 10 hours is. If you can't have jobs
> waiting for more than 10 hours, then it seems very small for 8-hour jobs.
> Generally, a few options:
> a. The --dependency option (either afterok or singleton)
> b. The --array option of sbatch with a limit of 1 job at a time (instead of
>    the for loop): sbatch --array=1-20%1
> c. At the end of the script of each job, call the sbatch line of the next
>    job (this is probably the only option if I understood the queue time
>    limit correctly).
>
> And indeed, srun should probably be reserved for strictly interactive jobs.
>
> Regards,
>     Yair.
>
> On Mon, Nov 25, 2019 at 11:21 AM Nigella Sanders <nigella.sand...@gmail.com> wrote:
>
>> Hi all,
>>
>> I guess this is a simple matter but I still find it confusing.
>>
>> I have to run 20 jobs on our supercomputer. Each job takes about 8 hours,
>> and each one needs the previous one to be completed. The queue time limit
>> for jobs is 10 hours.
>>
>> So my first approach was to launch them serially in a loop using srun:
>>
>> #!/bin/bash
>> for i in {1..20}; do
>>     srun --time 08:10:00 [options]
>> done
>>
>> However, the SLURM literature keeps saying that srun should only be used
>> for short command-line tests, so some sysadmins would consider this bad
>> practice (see this
>> <https://stackoverflow.com/questions/43767866/slurm-srun-vs-sbatch-and-their-parameters>).
>>
>> My second approach switched to sbatch:
>>
>> #!/bin/bash
>> for i in {1..20}; do
>>     sbatch --time 08:10:00 [options]
>>     [polling the queue to see if the job is done]
>> done
>>
>> But since sbatch returns the prompt immediately, I had to add code to check
>> for job termination. The polling relies on the sleep command and is prone
>> to race conditions, so sysadmins don't like it either.
>>
>> I understand there is a --wait option in some recent versions of SLURM (see
>> this <https://bugs.schedmd.com/show_bug.cgi?id=1685>), but it is not yet
>> available on our system.
>>
>> Is there any preferable/canonical/friendly way to do this?
>> Any thoughts would be really appreciated,
>>
>> Regards,
>> Nigella.
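PPS: For completeness, the single-allocation script I had in mind would look roughly like this. It is only a sketch: the 10-day --time value is hypothetical (our partition caps jobs at 10 hours) and [options] is a placeholder, as above:

#!/bin/bash
#SBATCH --time=240:00:00    # hypothetical 10-day limit, not available on our partition
#SBATCH [options]

for i in {1..20}; do
    # each iteration runs one ~8-hour execution as a job step inside the
    # same allocation; stop the chain if a step fails
    srun [options] || exit 1
done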