Dear all,

We tested an MPI Allreduce job in three modes (srun-dtcp, mpirun-slurm, mpirun-ssh) and found that the job running time in mpirun-ssh mode is shorter than in the other two modes.

We have set the following parameters:

/usr/lib/systemd/system/slurmd.service:
    LimitMEMLOCK=infinity
    LimitSTACK=infinity

/etc/sysconfig/slurmd:
    ulimit -l unlimited
    ulimit -s unlimited

We would like to know whether this is normal. Will features such as cgroup and pam_slurm_adopt limit job performance? And how can we improve the efficiency of Slurm jobs?

Here are the test results (IMB Allreduce, message size in bytes):

    size      srun-dtcp    mpirun-slurm   mpirun-ssh
    0             0.05          0.06           0.05
    4          2551.83        355.67         281.02
    8          1469.32       2419.97         139
    16           67.41        151.87        1000.15
    32           73.31        143.15         126.22
    64          107.14        111.6          126.3
    128          77.12        473.62         344.36
    256          46.39        417.95          65.09
    512          92.84        260.69          90.5
    1024         97.9         137.13         112.3
    2048        286.27        233.21         169.59
    4096        155.69        343.59         160.9
    8192        261.02        465.78         151.29
    16384     12518.04      13363.71        3869.58
    32768     22071.6       11398.21        4842.32
    65536      6041.2        6666.95        3368.58
    131072    10436.11      18071.53       10909.76
    262144    13802.22      24728.78        8263.53
    524288    13086.26      16394.72        4825.51
    1048576   28193.08      15943.26        6490.29
    2097152   63277.73      24411.58       15361.7
    4194304   58538.05      60516.15       33955.49
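For reference, below is a minimal check we could run to confirm that the memlock/stack limits actually reach the job steps and to see which MPI, task, and proctrack plugins are active; it is only a sketch, and the partition name and node count are placeholders taken from our own setup:

    #!/bin/bash
    #SBATCH -J limits-check
    #SBATCH -N 2
    #SBATCH --ntasks-per-node=1
    #SBATCH -p seperate

    # List the MPI plugin types this Slurm build supports (pmix_v3 should appear).
    srun --mpi=list

    # Print the locked-memory and stack limits as seen inside a job step on every node.
    srun bash -c 'echo "$(hostname): memlock=$(ulimit -l) stack=$(ulimit -s)"'

    # Show the task/proctrack plugins and limit propagation setting from the running config.
    scontrol show config | grep -i -E 'TaskPlugin|ProctrackType|PropagateResourceLimits'

If ulimit -l is not "unlimited" inside the step, the PropagateResourceLimits setting and the cgroup configuration would be the first things we would look at.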
(1) srun-dtcp job:

    #!/bin/bash
    #SBATCH -J test
    #SBATCH -N 32
    #SBATCH --ntasks-per-node=30
    #SBATCH -p seperate
    NP=$SLURM_NTASKS
    srun --mpi=pmix_v3 /public/software/benchmark/imb/hpcx/2017/IMB-MPI1 -npmin $NP Allreduce

(2) mpirun-slurm job:

    #!/bin/bash
    #SBATCH -J test
    #SBATCH -N 32
    #SBATCH --ntasks-per-node=30
    #SBATCH -p seperate
    NP=$SLURM_NTASKS
    mpirun /public/software/benchmark/imb/hpcx/2017/IMB-MPI1 -npmin $NP Allreduce

(3) mpirun-ssh job:

    #!/bin/bash
    #SBATCH -J test
    #SBATCH -N 32
    #SBATCH --ntasks-per-node=30
    #SBATCH -p seperate
    env | grep SLURM > env.log
    scontrol show hostname > nd.$SLURM_JOBID
    NNODE=$SLURM_NNODES
    NP=$SLURM_NTASKS
    mpirun -np $NP -machinefile nd.$SLURM_JOBID -map-by ppr:30:node \
        -mca plm rsh -mca plm_rsh_no_tree_spawn 1 -mca plm_rsh_num_concurrent $NNODE \
        /public/software/benchmark/imb/hpcx/2017/IMB-MPI1 -npmin $NP Allreduce

Best wishes!
menglong