Dear all,

We tested an MPI Allreduce job in three modes (srun-dtcp, mpirun-slurm, mpirun-ssh) and found that the job running time in mpirun-ssh mode is shorter than in the other two modes.

We have set the following parameters:

/usr/lib/systemd/system/slurmd.service:
    LimitMEMLOCK=infinity
    LimitSTACK=infinity

/etc/sysconfig/slurmd:
    ulimit -l unlimited
    ulimit -s unlimited

We would like to know whether this is normal. Will features such as cgroup and pam_slurm_adopt limit job performance? And how can we improve the efficiency of Slurm jobs?

Here are the test results (IMB Allreduce, message size in bytes):

    size      srun-dtcp    mpirun-slurm   mpirun-ssh
    0             0.05          0.06           0.05
    4          2551.83        355.67         281.02
    8          1469.32       2419.97         139
    16           67.41        151.87        1000.15
    32           73.31        143.15         126.22
    64          107.14        111.6          126.3
    128          77.12        473.62         344.36
    256          46.39        417.95          65.09
    512          92.84        260.69          90.5
    1024         97.9         137.13         112.3
    2048        286.27        233.21         169.59
    4096        155.69        343.59         160.9
    8192        261.02        465.78         151.29
    16384     12518.04      13363.71        3869.58
    32768     22071.6       11398.21        4842.32
    65536      6041.2        6666.95        3368.58
    131072    10436.11      18071.53       10909.76
    262144    13802.22      24728.78        8263.53
    524288    13086.26      16394.72        4825.51
    1048576   28193.08      15943.26        6490.29
    2097152   63277.73      24411.58       15361.7
    4194304   58538.05      60516.15       33955.49
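For reference, below is a minimal check we could run to confirm that the memlock/stack limits actually reach the job steps and to see which MPI, task, and proctrack plugins are active; it is only a sketch, and the partition name and node count are placeholders taken from our own setup:

    #!/bin/bash
    #SBATCH -J limits-check
    #SBATCH -N 2
    #SBATCH --ntasks-per-node=1
    #SBATCH -p seperate

    # List the MPI plugin types this Slurm build supports (pmix_v3 should appear).
    srun --mpi=list

    # Print the locked-memory and stack limits as seen inside a job step on every node.
    srun bash -c 'echo "$(hostname): memlock=$(ulimit -l) stack=$(ulimit -s)"'

    # Show the task/proctrack plugins and limit propagation setting from the running config.
    scontrol show config | grep -i -E 'TaskPlugin|ProctrackType|PropagateResourceLimits'

If ulimit -l is not "unlimited" inside the step, the PropagateResourceLimits setting and the cgroup configuration would be the first things we would look at.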
(1) srun-dtcp job:

    #!/bin/bash
    #SBATCH -J test
    #SBATCH -N 32
    #SBATCH --ntasks-per-node=30
    #SBATCH -p seperate
    NP=$SLURM_NTASKS
    srun --mpi=pmix_v3 /public/software/benchmark/imb/hpcx/2017/IMB-MPI1 -npmin $NP Allreduce

(2) mpirun-slurm job:

    #!/bin/bash
    #SBATCH -J test
    #SBATCH -N 32
    #SBATCH --ntasks-per-node=30
    #SBATCH -p seperate
    NP=$SLURM_NTASKS
    mpirun /public/software/benchmark/imb/hpcx/2017/IMB-MPI1 -npmin $NP Allreduce

(3) mpirun-ssh job:

    #!/bin/bash
    #SBATCH -J test
    #SBATCH -N 32
    #SBATCH --ntasks-per-node=30
    #SBATCH -p seperate
    env | grep SLURM > env.log
    scontrol show hostname > nd.$SLURM_JOBID
    NNODE=$SLURM_NNODES
    NP=$SLURM_NTASKS
    mpirun -np $NP -machinefile nd.$SLURM_JOBID -map-by ppr:30:node \
        -mca plm rsh -mca plm_rsh_no_tree_spawn 1 -mca plm_rsh_num_concurrent $NNODE \
        /public/software/benchmark/imb/hpcx/2017/IMB-MPI1 -npmin $NP Allreduce

Best wishes!
menglong