Generally OpenMPI will be able to autodetect much of the IB setup; you
just need to make sure you have UCX. With modern OpenMPI you will need
to build a version of PMIx to hook into Slurm. Slurm will also need to
be built against PMIx for the best experience. Thus, in terms of
order of operations:
1. Make sure UCX is installed and that it is detecting the IB.
2. Make sure PMIx is installed.
3. Get Slurm up and running built against PMIx (you don't need to build
it against UCX).
4. Build OpenMPI against PMIx, Slurm, and UCX. Generally OpenMPI will
autodetect these, but you can pass configure options (--with-pmix,
--with-slurm, --with-ucx) to make sure they are used.
5. Test and verify it is working.
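As a rough sketch of what steps 1, 3 and 4 could look like when checked from a script: the snippet below runs `ucx_info -d`, `srun --mpi=list` and `ompi_info` and looks for a marker string in each output. The marker strings are assumptions, not guarantees: `mlx4` is the driver name ConnectX-3 cards usually appear under, and `pmix` and `ucx` are the names those components normally carry in the Slurm and OpenMPI listings. The helper name `run_check` is made up for this example.

```python
import shutil
import subprocess

# (description, command, text expected somewhere in the output)
# The expected strings are assumptions about a typical setup, not guarantees.
CHECKS = [
    ("UCX detects the IB device", ["ucx_info", "-d"], "mlx4"),
    ("Slurm offers a PMIx plugin", ["srun", "--mpi=list"], "pmix"),
    ("OpenMPI was built with UCX", ["ompi_info"], "ucx"),
]


def run_check(cmd, needle):
    """Run cmd and report whether needle appears in its output."""
    if shutil.which(cmd[0]) is None:
        return False, f"{cmd[0]} not found on PATH"
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, f"could not run {cmd[0]}: {exc}"
    # Some tools print their listing on stderr, so search both streams.
    output = (proc.stdout + proc.stderr).lower()
    if needle.lower() in output:
        return True, f"found '{needle}'"
    return False, f"'{needle}' not in output (exit code {proc.returncode})"


if __name__ == "__main__":
    for description, cmd, needle in CHECKS:
        ok, detail = run_check(cmd, needle)
        print(f"[{'ok' if ok else 'FAIL'}] {description}: {detail}")
```

This only confirms that the pieces can see each other. The real test for step 5 is still to launch a small MPI program across two nodes, for example through `srun --mpi=pmix`, and confirm that it runs over the IB fabric.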
-Paul Edmon-
On 10/20/2021 6:08 AM, leo camilo wrote:
I have recently acquired a few ConnectX-3 cards and an unmanaged IB
switch (IS5022) to upgrade my department's Beowulf cluster.
Thus far, I have been able to verify that the cards and switch work
via the MFT and open-source tools in Ubuntu.
However, I was wondering if anyone knows of any guides or resources for
setting up a cluster for MPI-based computations in a Linux/Debian
environment. Guides on how to make it work with SLURM would
also be appreciated.
Thanks in advance for any suggestions. I am often a user of clusters,
but this is my first time setting one up.
Cheers,
Leonardo
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf