On Thu, 05 Apr 2018 09:10:57 -0600 Faraz Hussain <i...@feacluster.com> wrote:
> Here's something quite baffling. I have a cluster running slurm but
> have not setup passwordless ssh for a user yet. So when the user
> runs "mpirun -n 2 -hostfile hosts hostname", it will hang because of
> ssh issue. That is expected.
>
> Now the baffling thing is the mpirun command works inside a slurm
> script! How can it work if passwordless ssh has not been configured?
> Does slurm use some different authentication (munge?) to login to
> the hosts and execute the hostname command?

What happens is that mpirun sees the slurm environment variables and
switches to a slurm-aware mode. In this mode it uses srun, not ssh, to
launch pmi_proxy processes on each node of the job. It then starts all
ranks through these pmi_proxy processes.

The process tree ends up being something like this on the first node:

  slurmd->slurmstepd->bash(jobscript)->mpirun->srun -w nodes[..] pmi_proxy

And on the other nodes:

  slurmd->slurmstepd->pmi_proxy->rank[0...n]

Authentication/authorization is handled by slurm and depends on how you
set it up (often munge).

Cheers,
 Peter K
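
A rough sketch of the launcher switch described above, in Python. This is not the code of any real mpirun: pick_launcher is a hypothetical helper, SLURM_JOB_ID is assumed to be the variable that triggers the slurm-aware mode (slurm does set it inside a job, but which variables a given mpirun actually checks is an assumption), and the srun options shown are just one plausible way to start one proxy per node.

```python
import os


def pick_launcher(env, hosts, proxy_cmd):
    """Return the command lines needed to start one proxy per host.

    Hypothetical illustration only, not taken from any MPI implementation.
    """
    if "SLURM_JOB_ID" in env:
        # Inside a slurm job: a single srun starts the proxy on every node.
        # slurm authenticates the request (often via munge), so ssh is
        # never involved.
        n = str(len(hosts))
        return [["srun", "--nodelist=" + ",".join(hosts),
                 "--nodes=" + n, "--ntasks=" + n] + proxy_cmd]
    # Outside slurm: one ssh per host, which hangs or prompts unless
    # passwordless ssh has been set up for the user.
    return [["ssh", host] + proxy_cmd for host in hosts]


if __name__ == "__main__":
    # Print what would be run in the current environment.
    for cmd in pick_launcher(os.environ, ["node1", "node2"], ["pmi_proxy"]):
        print(" ".join(cmd))
```

Run from a login shell, this prints two ssh commands; run from inside a job script, where slurm has set SLURM_JOB_ID, it prints a single srun command, which is why the same mpirun line behaves differently in the two cases.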