Generally OpenMPI will autodetect much of the IB setup; you just need to make sure you have UCX installed.  With modern OpenMPI you will need to build a version of PMIx to hook into Slurm.  Slurm also needs to be built against PMIx for the best experience.  Thus, in terms of order of operations:

1. Make sure UCX is installed and that it is detecting the IB.

2. Make sure PMIx is installed.

3. Get Slurm up and running built against PMIx (you don't need to build it against UCX).

4. Build OpenMPI against PMIx, Slurm, and UCX.  Generally OpenMPI's configure script will autodetect these, but you can pass explicit options (for example --with-pmix, --with-slurm, and --with-ucx) to make sure they are picked up.

5. Test and verify it is working.
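
As a rough illustration of the checks in steps 1, 3, and 5, here is a minimal Python sketch.  It assumes that `ucx_info -d` and `srun --mpi=list` are on the PATH and that their output looks roughly like what current UCX and Slurm releases print; the exact format varies by version, so treat the parsing as a starting point rather than a definitive test.  The helper names (ib_transports, slurm_has_pmix, run) are made up for this example.

```python
import shutil
import subprocess

# UCX transports that ride on an InfiniBand HCA are named rc_*, ud_*, or dc_*.
IB_TRANSPORT_PREFIXES = ("rc", "ud", "dc")


def ib_transports(ucx_info_output):
    """Return the InfiniBand transports named in `ucx_info -d` output."""
    found = []
    for line in ucx_info_output.splitlines():
        # Lines of interest look like "#      Transport: rc_verbs".
        text = line.lstrip("# \t")
        if text.startswith("Transport:"):
            name = text.split(":", 1)[1].strip()
            if name.split("_")[0] in IB_TRANSPORT_PREFIXES and name not in found:
                found.append(name)
    return found


def slurm_has_pmix(mpi_list_output):
    """Return True if `srun --mpi=list` output mentions a pmix plugin."""
    return any(tok.strip(",").startswith("pmix") for tok in mpi_list_output.split())


def run(cmd):
    """Run a command and return stdout plus stderr, or None if it is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


if __name__ == "__main__":
    ucx_out = run(["ucx_info", "-d"])
    if ucx_out is None:
        print("ucx_info not found: is UCX installed?")
    else:
        print("IB transports seen by UCX:", ib_transports(ucx_out) or "none")

    slurm_out = run(["srun", "--mpi=list"])
    if slurm_out is None:
        print("srun not found: is Slurm installed?")
    else:
        print("Slurm built with PMIx:", slurm_has_pmix(slurm_out))
```

If UCX reports no rc/ud/dc transports, fix that first; OpenMPI cannot use the IB fabric through UCX until UCX itself sees the cards.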

-Paul Edmon-

On 10/20/2021 6:08 AM, leo camilo wrote:
I have recently acquired a few ConnectX-3 cards and an unmanaged IB switch (IS5022) to upgrade my department's Beowulf cluster.

Thus far, I have been able to verify that the cards and switch work using the MFT and open-source tools in Ubuntu.

I was wondering, though, whether anyone knows of any guides or resources for setting up a cluster for MPI-based computations in a Linux/Debian environment. Some guides on making it work with Slurm would also be appreciated.

Thanks in advance for any suggestions. I am often a user of clusters, but this is my first time setting one up.

Cheers,

Leonardo

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
