Thanks, after building openmpi 4 from source, it now works! However, it
still gives the message below when I run openmpi with the verbose setting:
No OpenFabrics connection schemes reported that they were able to be
used on a specific port. As such, the openib BTL (OpenFabrics
support) will be disabled for this port.
Local host: lustwzb34
Local device: mlx4_0
Local port: 1
CPCs attempted: rdmacm, udcm
However, the results from my latency and bandwidth tests seem to be
what I would expect from InfiniBand. See:
[hussaif1@lustwzb34 pt2pt]$ mpirun -v -np 2 -hostfile ./hostfile ./osu_latency
# OSU MPI Latency Test v5.3.2
# Size Latency (us)
0 1.87
1 1.88
2 1.93
4 1.92
8 1.93
16 1.95
32 1.93
64 2.08
128 2.61
256 2.72
512 2.93
1024 3.33
2048 3.81
4096 4.71
8192 6.68
16384 8.38
32768 12.13
65536 19.74
131072 35.08
262144 64.67
524288 122.11
1048576 236.69
2097152 465.97
4194304 926.31
[hussaif1@lustwzb34 pt2pt]$ mpirun -v -np 2 -hostfile ./hostfile ./osu_bw
# OSU MPI Bandwidth Test v5.3.2
# Size Bandwidth (MB/s)
1 3.09
2 6.35
4 12.77
8 26.01
16 51.31
32 103.08
64 197.89
128 362.00
256 676.28
512 1096.26
1024 1819.25
2048 2551.41
4096 3886.63
8192 3983.17
16384 4362.30
32768 4457.09
65536 4502.41
131072 4512.64
262144 4531.48
524288 4537.42
1048576 4510.69
2097152 4546.64
4194304 4565.12
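
To compare the osu_bw peak with a link speed, the MB/s figure can be converted to Gbit/s. The following is only a rough sketch: it assumes osu_bw counts 1 MB as 10**6 bytes, and the candidate link rates are nominal data rates listed for comparison, since the thread does not say which link speed is actually in use.

```python
# Peak value taken from the osu_bw output above (4194304-byte messages).
PEAK_MB_PER_S = 4565.12

# Assumption: osu_bw reports MB/s with 1 MB = 10**6 bytes.
BYTES_PER_MB = 1_000_000


def mb_per_s_to_gbit_per_s(mb_per_s, bytes_per_mb=BYTES_PER_MB):
    """Convert a bandwidth in MB/s to Gbit/s."""
    return mb_per_s * bytes_per_mb * 8 / 1e9


# Nominal data rates in Gbit/s, for comparison only (hypothetical candidates,
# not taken from the thread).
CANDIDATE_LINKS = {
    "10 Gb Ethernet": 10.0,
    "InfiniBand QDR 4x (8b/10b encoding)": 32.0,
    "40 Gb Ethernet": 40.0,
}

peak_gbit = mb_per_s_to_gbit_per_s(PEAK_MB_PER_S)
print(f"peak: {peak_gbit:.2f} Gbit/s")
for name, rate in CANDIDATE_LINKS.items():
    print(f"{name}: {100 * peak_gbit / rate:.0f}% of nominal")
```

Under these assumptions the peak is far above what a 10 Gb link could carry and somewhat below 40 Gbit/s, so the numbers look more like a 40 Gb class fabric than plain TCP over a slower link.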
When I run ibv_devinfo I get:
[hussaif1@lustwzb34 pt2pt]$ ibv_devinfo
hca_id: mlx4_0
transport: InfiniBand (0)
fw_ver: 2.36.5000
node_guid: 480f:cfff:fff5:c6c0
sys_image_guid: 480f:cfff:fff5:c6c3
vendor_id: 0x02c9
vendor_part_id: 4103
hw_ver: 0x0
board_id: HP_1360110017
phys_port_cnt: 2
Device ports:
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
port: 2
state: PORT_DOWN (1)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
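
Both ports in the output above report link_layer: Ethernet, which suggests the adapter is configured in Ethernet mode, so any RDMA traffic would be RoCE rather than native InfiniBand. A minimal sketch for pulling the per-port state and link layer out of ibv_devinfo-style text follows; the SAMPLE string is a trimmed copy of the output above, and parse_ports is a hypothetical helper, not part of any library.

```python
# Trimmed copy of the ibv_devinfo output above, used as sample input.
SAMPLE = """\
hca_id: mlx4_0
transport: InfiniBand (0)
phys_port_cnt: 2
Device ports:
port: 1
state: PORT_ACTIVE (4)
link_layer: Ethernet
port: 2
state: PORT_DOWN (1)
link_layer: Ethernet
"""


def parse_ports(text):
    """Return {port_number: {field: value}} from ibv_devinfo-style text."""
    ports = {}
    current = None
    for line in text.splitlines():
        # Split on the first colon only, so values containing colons survive.
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "port":
            current = int(value)
            ports[current] = {}
        elif current is not None:
            # Fields seen before the first "port:" line belong to the device
            # and are skipped here.
            ports[current][key] = value
    return ports


ports = parse_ports(SAMPLE)
for number, fields in sorted(ports.items()):
    print(number, fields.get("state"), fields.get("link_layer"))
```

On a live system the same function could be fed the captured output of ibv_devinfo instead of SAMPLE.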
I will ask the openmpi mailing list whether my results make sense.
Quoting Gus Correa <g...@ldeo.columbia.edu>:
Hi Faraz
By all means, download the Open MPI tarball and build from source.
Otherwise there won't be support for IB (the CentOS Open MPI packages most
likely rely only on TCP/IP).
Read their README file (it comes in the tarball), and take a careful look
at their (excellent) FAQ:
https://www.open-mpi.org/faq/
Many issues can be solved by just reading these two resources.
If you hit more trouble, subscribe to the Open MPI mailing list, and ask
questions there,
because you will get advice directly from the Open MPI developers, and the
fix will come easy.
https://www.open-mpi.org/community/lists/ompi.php
My two cents,
Gus Correa
On Tue, Apr 30, 2019 at 3:07 PM Faraz Hussain <i...@feacluster.com> wrote:
Thanks, yes, I have installed those libraries. See below. Initially I
installed the libraries via yum. But then I tried installing the RPMs
directly from the Mellanox website
(MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.5-x86_64.tar). Even after doing
that, I still got the same error with openmpi. I will try your
suggestion of building openmpi from source next!
root@lustwzb34:/root # yum list | grep ibverbs
libibverbs.x86_64 41mlnx1-OFED.4.5.0.1.0.45101
libibverbs-devel.x86_64 41mlnx1-OFED.4.5.0.1.0.45101
libibverbs-devel-static.x86_64 41mlnx1-OFED.4.5.0.1.0.45101
libibverbs-utils.x86_64 41mlnx1-OFED.4.5.0.1.0.45101
libibverbs.i686 17.2-3.el7 rhel-7-server-rpms
libibverbs-devel.i686 1.2.1-1.el7 rhel-7-server-rpms
root@lustwzb34:/root # lsmod | grep ib
ib_ucm 22602 0
ib_ipoib 168425 0
ib_cm 53141 3 rdma_cm,ib_ucm,ib_ipoib
ib_umad 22093 0
mlx5_ib 339961 0
ib_uverbs 121821 3 mlx5_ib,ib_ucm,rdma_ucm
mlx5_core 919178 2 mlx5_ib,mlx5_fpga_tools
mlx4_ib 211747 0
ib_core 294554 10 rdma_cm,ib_cm,iw_cm,mlx4_ib,mlx5_ib,ib_ucm,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib
mlx4_core 360598 2 mlx4_en,mlx4_ib
mlx_compat 29012 15 rdma_cm,ib_cm,iw_cm,mlx4_en,mlx4_ib,mlx5_ib,mlx5_fpga_tools,ib_ucm,ib_core,ib_umad,ib_uverbs,mlx4_core,mlx5_core,rdma_ucm,ib_ipoib
devlink 42368 4 mlx4_en,mlx4_ib,mlx4_core,mlx5_core
libcrc32c 12644 3 xfs,nf_nat,nf_conntrack
root@lustwzb34:/root #
> Did you install libibverbs (and libibverbs-utils, for information and
> troubleshooting)?
> yum list |grep ibverbs
> Are you loading the ib modules?
> lsmod |grep ib
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing