See the URL below for a good overview of how Slurm works:
https://slurm.schedmd.com/quickstart.html
The way I understand it, tasks are started by slurmd. SSH is not
involved at all.
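As a rough sketch of the idea (not how any real mpirun is implemented), a launcher can check whether it is running inside a Slurm job and, if so, hand task startup to Slurm instead of ssh. The helper name pick_launcher is made up for illustration. The only Slurm-specific detail assumed here is the SLURM_JOB_ID environment variable, which Slurm sets inside a job.

```python
import os


def pick_launcher(env=None):
    """Hypothetical helper: decide how remote tasks would be started."""
    env = os.environ if env is None else env
    # Slurm sets SLURM_JOB_ID in the environment of a running job.
    if "SLURM_JOB_ID" in env:
        # Inside a job: let Slurm (slurmd on each node) start the tasks.
        return "slurm"
    # Outside a job: fall back to logging in to each host over ssh,
    # which needs passwordless ssh to be configured.
    return "ssh"


if __name__ == "__main__":
    print(pick_launcher())
```

Run from a login shell this would pick "ssh", and inside a batch script it would pick "slurm", which matches the behaviour described in the question.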
SGE does the same thing with 'tight integration'. The tasks are started
on the compute nodes by sge_execd, which spawns an sge_shepherd process,
which then spawns the actual task.
To really complicate things, you should look at the Process Management
Interface (PMI). This is a middle layer between Slurm (or another
scheduler) and the MPI tasks. It's a standardized abstraction layer to
make programming MPI implementations and schedulers easier. It also
increases the startup time of MPI jobs, which is not insignificant for
large jobs.
www.mcs.anl.gov/papers/P1760.pdf
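To give a feel for what that middle layer does, here is a toy sketch, assuming a simple put/fence/get key-value model. The class ToyKVS and the function exchange_addresses are hypothetical names for illustration only. This is not the real PMI API, and everything runs in one process.

```python
class ToyKVS:
    """In-memory stand-in for a key-value space kept by the process manager."""

    def __init__(self):
        self._pending = {}   # values put since the last fence
        self._visible = {}   # values every rank is allowed to read

    def put(self, key, value):
        # A rank publishes something, e.g. the address it listens on.
        self._pending[key] = value

    def fence(self):
        # Synchronization point: everything put so far becomes visible.
        self._visible.update(self._pending)
        self._pending.clear()

    def get(self, key):
        # Raises KeyError if the key was never made visible by a fence.
        return self._visible[key]


def exchange_addresses(kvs, addresses):
    """Simulate each rank publishing its address, then reading all of them."""
    for rank, addr in enumerate(addresses):
        kvs.put(f"addr-{rank}", addr)
    kvs.fence()
    return [kvs.get(f"addr-{rank}") for rank in range(len(addresses))]


if __name__ == "__main__":
    print(exchange_addresses(ToyKVS(), ["node1:5000", "node2:5001"]))
```

The fence step is where the cost shows up: every task has to wait until all the others have published their data, so the wait grows with the size of the job.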
Prentice
On 04/05/2018 11:10 AM, Faraz Hussain wrote:
Here's something quite baffling. I have a cluster running Slurm but
have not set up passwordless ssh for a user yet. So when the user runs
"mpirun -n 2 -hostfile hosts hostname", it will hang because of the ssh
issue. That is expected.
Now the baffling thing is the mpirun command works inside a slurm
script! How can it work if passwordless ssh has not been configured?
Does slurm use some different authentication (munge?) to login to the
hosts and execute the hostname command?
Or does Slurm have some fancy behind-the-scenes integration with Intel
MPI?
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf