Hi Faraz
1) lsmod | grep ib should show whether the InfiniBand kernel modules are loaded.
2) InfiniBand normally uses remote DMA (RDMA) through "verbs".
You should see an "ib" module with "verbs" in the name (for example ib_uverbs).
That is the preferred/faster mode for MPI.
3) However, you can also use InfiniBand for TCP/IP (slower).
As the output of your ifconfig shows, your ib0 interface is
also configured for TCP/IP.
4) You may have two interfaces (one card with two ports, or two cards) in the
nodes. One of them (ib1) may not be connected to a switch. Check the back of
your nodes.
5) How to check whether MPI is using it depends a bit on which MPI library
you're using.
Which one? Open MPI, MVAPICH2, some vendor/proprietary one?
If it is Open MPI, the command "ompi_info" will tell you whether it was
built with InfiniBand support.
With Open MPI there are also ways to enable/disable
InfiniBand at runtime.
6) Some InfiniBand diagnostics may also help (normally in /usr/sbin):
ibstat
ibhosts
ibnetdiscover
etc
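The checks in points 1, 5 and 6 could be strung together roughly as below.
This is only a sketch: the function names (ib_modules, ib_port_states,
has_openib_btl) are made up for illustration, the module-name patterns are
a guess at what typical drivers are called, and the Open MPI part assumes a
build of this era where InfiniBand support shows up as the "openib" BTL.
./my_mpi_app is a placeholder for your own program.

```shell
#!/bin/sh
# Hypothetical helpers; each one reads command output on stdin so it can
# be tried on saved output as well as on a live node.

# Print the names of InfiniBand/RDMA-looking modules from lsmod output.
# Assumed naming patterns: ib_*, *_ib, rdma_* (e.g. ib_uverbs, mlx4_ib).
ib_modules() {
    awk '$1 ~ /^ib_/ || $1 ~ /_ib$/ || $1 ~ /^rdma_/ { print $1 }'
}

# Pull the State, Physical state and Rate lines out of ibstat output
# and print them as key=value pairs.
ib_port_states() {
    awk -F': *' '/^[ \t]*(State|Physical state|Rate):/ {
        gsub(/^[ \t]+/, "", $1)
        print $1 "=" $2
    }'
}

# Succeed if ompi_info output lists the openib BTL component.
has_openib_btl() {
    grep -q 'MCA btl: openib'
}

# Typical use on a compute node (not run here):
#   lsmod | ib_modules
#   /usr/sbin/ibstat | ib_port_states
#   ompi_info | has_openib_btl && echo "Open MPI was built with openib"
#
# Selecting the transport at runtime with Open MPI:
#   mpirun --mca btl openib,self ./my_mpi_app   # InfiniBand verbs only
#   mpirun --mca btl tcp,self ./my_mpi_app      # TCP only, for comparison
#   mpirun --mca btl ^openib ./my_mpi_app       # anything except openib
```

A healthy, cabled port would normally report State=Active and
Physical state=LinkUp. Comparing run times between the openib and tcp
runs is a quick way to see whether MPI traffic is really going over verbs.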
OK, this is my pedestrian view of InfiniBand.
Now let's hear the experts in the list for deeper insights. :)
I hope this helps,
Gus Correa
On 08/02/2017 12:44 PM, Faraz Hussain wrote:
I have inherited a 20-node cluster that supposedly has an infiniband
network. I am testing some mpi applications and am seeing no performance
improvement with multiple nodes. So I am wondering if the InfiniBand
network even works?
The output of ifconfig -a shows an ib0 and ib1 network. I ran ethtool
ib0 and it shows:
Speed: 40000Mb/s
Link detected: no
and for ib1 it shows:
Speed: 10000Mb/s
Link detected: no
I am assuming this means it is down? Any idea how to debug further and
restart it?
Thanks!
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf