See the URL below for a good overview of how Slurm works:

https://slurm.schedmd.com/quickstart.html

The way I understand it, tasks are started by slurmd, the Slurm daemon running on each compute node. SSH is not involved at all.

SGE does the same thing with 'tight integration'. The tasks are started on the compute nodes by sge_execd, which spawns an sge_shepherd process, which then spawns the actual task.
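
To make the "no SSH inside a job" behavior concrete, here is a minimal sketch of the kind of decision an MPI launcher can make. It assumes the launcher detects a Slurm allocation through the SLURM_JOB_ID environment variable, which Slurm sets inside a job. The function names pick_launcher and build_command are hypothetical and are not taken from any real MPI implementation.

```python
import os


def pick_launcher(env=None):
    """Return 'srun' inside a Slurm allocation, otherwise 'ssh'.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    # Assumption: the presence of SLURM_JOB_ID means we are inside a Slurm job.
    if env.get("SLURM_JOB_ID"):
        return "srun"
    return "ssh"


def build_command(hosts, program, env=None):
    """Build the command line(s) a launcher would run to start `program`."""
    if pick_launcher(env) == "srun":
        # One srun call; slurmd on each allocated node starts the tasks,
        # so no passwordless SSH is needed.
        return [["srun", "--ntasks", str(len(hosts)), *program]]
    # Outside a job: one ssh call per host, which needs working SSH logins.
    return [["ssh", host, *program] for host in hosts]
```

Called as build_command(["n1", "n2"], ["hostname"]) on a login node, this takes the ssh branch, which is where the hang in the original question would come from. Inside a batch script it takes the srun branch instead.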

To really complicate things, you should look at the Process Management Interface (PMI). This is a middle layer between Slurm (or another scheduler) and the MPI tasks. It's a standardized abstraction layer that makes it easier to write MPI implementations and schedulers. It also increases the startup time of MPI jobs, which is not insignificant for large jobs.

www.mcs.anl.gov/papers/P1760.pdf
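
As a rough illustration of what that middle layer does, here is a toy, single-process model of the put / fence / get pattern that PMI-style key-value exchange follows. The real interface is a C API provided by the process manager. The class ToyKVS, the function wire_up, and the fake "nodeN:port" addresses are all invented for this sketch.

```python
class ToyKVS:
    """In-memory stand-in for the key-value space a process manager provides."""

    def __init__(self):
        self.pending = {}    # values published but not yet visible to peers
        self.committed = {}  # values visible to everyone after a fence

    def put(self, rank, key, value):
        # Each MPI task publishes its own connection info.
        self.pending[(rank, key)] = value

    def fence(self):
        # Collective synchronization point: pending values become visible.
        self.committed.update(self.pending)
        self.pending.clear()

    def get(self, rank, key):
        # Look up what another rank published; KeyError if not yet fenced.
        return self.committed[(rank, key)]


def wire_up(nprocs):
    """Simulate nprocs tasks exchanging addresses through the toy KVS."""
    kvs = ToyKVS()
    for rank in range(nprocs):
        kvs.put(rank, "addr", f"node{rank}:{5000 + rank}")  # made-up addresses
    kvs.fence()
    # After the fence, every rank can read every other rank's address.
    return {
        rank: [kvs.get(peer, "addr") for peer in range(nprocs) if peer != rank]
        for rank in range(nprocs)
    }
```

In a real job every task has to reach the fence before any task can proceed, so this exchange is a cost that grows with job size. That fits the point above about startup time for large jobs.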

Prentice

On 04/05/2018 11:10 AM, Faraz Hussain wrote:
Here's something quite baffling. I have a cluster running Slurm but have not set up passwordless ssh for a user yet. So when the user runs "mpirun -n 2 -hostfile hosts hostname", it hangs because of the ssh issue. That is expected.

Now the baffling thing is that the mpirun command works inside a Slurm script! How can it work if passwordless ssh has not been configured? Does Slurm use some different authentication (munge?) to log in to the hosts and execute the hostname command?

Or does Slurm have some fancy behind-the-scenes integration with Intel MPI?

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
