Sure ;) My example was just meant as a quick reproducer.

The complete job farm script is (if that's of interest):

----------------------------------------->

#!/bin/bash
#SBATCH -J jobfarm_test
#SBATCH -o log.%x.%j.%N.out
#SBATCH -D ./
#SBATCH --mail-type=NONE
#SBATCH --time=00:05:00
#SBATCH --export=NONE
#SBATCH --get-user-env
#SBATCH --clusters=...
#SBATCH --partition=..
#SBATCH --qos=lrz_admin
#SBATCH --nodes=2
#SBATCH --ntasks=4               # needed so SLURM_NTASKS is set for the parallel call below

module load slurm_setup    # LRZ specific
module load parallel             # GNU parallel

# Hyperthreading
export OMP_NUM_THREADS=28
export MY_SLURM_PARAMS="-N 1 -n 1 -c 28 --threads-per-core=2 --mem=27G --exact \
--export=ALL --cpu_bind=verbose,cores --mpi=none"

## Not Hyperthreading
#export OMP_NUM_THREADS=14
#export MY_SLURM_PARAMS="-N 1 -n 1 -c 28 --threads-per-core=2 --mem=27G --exact \
#--export=ALL --cpu_bind=verbose,cores --mpi=none"

export MYEXEC=/lrz/sys/tools/placement_test_2021/bin/placement-test.omp_only
export PARAMS=("-d 20" "-d 10" "-d 20" "-d 10" "-d 20" "-d 10" "-d 20" "-d 10")

task() {
   # $1: task number from parallel ({#}), used to name the per-task log file
   # $2: parameter string for this task ({}), passed on to the executable
   echo "srun $MY_SLURM_PARAMS $MYEXEC $2 &> log2.$1"
   srun $MY_SLURM_PARAMS $MYEXEC $2 &> log2.$1
}
export -f task

parallel -P $SLURM_NTASKS task {#} {} ::: "${PARAMS[@]}"
----------------------------------------->

The good thing here is that users may only need to modify the SBATCH header and 
the exported environment variables.
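
For example, adapting it to a different (hypothetical) serial or OpenMP code and a different task list would only amount to something like the lines below; the executable path, memory value and parameters are made up for illustration, and the task() function and the parallel call stay untouched:

export OMP_NUM_THREADS=14
export MY_SLURM_PARAMS="-N 1 -n 1 -c 14 --mem=20G --exact --export=ALL --cpu_bind=verbose,cores --mpi=none"
export MYEXEC=$HOME/bin/my_solver                       # hypothetical user code
export PARAMS=("case01.inp" "case02.inp" "case03.inp")  # one entry per task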


But Magnus (thanks for the link!) is right. This is still far from a 
feature-rich job- or task-farming framework, where at least an overview of 
passed/failed/missing tasks is available, etc.
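
As a small step in that direction, GNU parallel's --joblog option could at least record the runtime and exit value of each task. A sketch (untested on our setup) on top of the script above:

parallel -P $SLURM_NTASKS --joblog jobfarm.joblog task {#} {} ::: "${PARAMS[@]}"

# tasks with a non-zero exit value (Exitval is the 7th joblog column)
awk 'NR>1 && $7 != 0' jobfarm.joblog

Rerunning only the failed or missing tasks should also be possible by combining the same --joblog file with parallel's --resume-failed.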

But for on the order of a few dozen tasks, the above is imho a feasible and 
flexible approach (as long as srun keeps playing by the rules).


For 1000 tasks, sure, something else is needed. I played with Julia's pmap 
(https://doku.lrz.de/display/PUBLIC/FAQ%3A+Julia+on+SuperMUC-NG+and+HPC+Systems#FAQ:JuliaonSuperMUCNGandHPCSystems-MoreExamplesandExampleUseCases), 
which, however, also reacted rather badly to the srun-interface changes. So I 
drifted away from it again. Maybe too easily :scratch_head:

Anyway, Magnus, I will try it.


Huge thanks to you all!

Kind regards,

Martin





________________________________
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Ward 
Poelmans <ward.poelm...@vub.be>
Sent: Wednesday, 18 January 2023 15:00
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] srun jobfarming hassle question

Hi Martin,

Just a tip: use gnu parallel instead of a for loop. Much easier and more 
powerful.

Like:

parallel -j $SLURM_NTASKS srun -N 1 -n 1 -c 1 --exact <command> ::: *.input


Ward
