Isn't latency over RDMA a bit high? When I've tested QDR and FDR I tend to see around 1 us (using mpitests-osu_latency) between two nodes.

/jon

On 08/03/2017 06:50 PM, Faraz Hussain wrote:
Here are the results from the tcp and rdma tests. I take them to mean that the IB network is performing at the expected speed.

[hussaif1@lustwzb5 ~]$ qperf lustwzb4 -t 30 tcp_lat tcp_bw
tcp_lat:
    latency  =  24.2 us
tcp_bw:
    bw  =  1.19 GB/sec
[hussaif1@lustwzb5 ~]$ qperf lustwzb4 -t 30 rc_lat rc_bw
rc_lat:
    latency  =  7.76 us
rc_bw:
    bw  =  4.56 GB/sec
[hussaif1@lustwzb5 ~]$
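
To put the two runs side by side, here is a minimal Python sketch that parses the qperf text above and prints the RC-to-TCP ratios. The function name parse_qperf is made up for illustration, and the parser assumes qperf's default "name = value unit" layout exactly as pasted here.

```python
import re

# Text copied from the two qperf runs above (prompts removed).
QPERF_OUTPUT = """\
tcp_lat:
    latency  =  24.2 us
tcp_bw:
    bw  =  1.19 GB/sec
rc_lat:
    latency  =  7.76 us
rc_bw:
    bw  =  4.56 GB/sec
"""


def parse_qperf(text):
    """Return {test_name: (value, unit)} from qperf's default output."""
    results = {}
    test = None
    for line in text.splitlines():
        header = re.match(r"^(\w+):\s*$", line)
        if header:
            test = header.group(1)
            continue
        metric = re.match(r"^\s+\w+\s*=\s*([\d.]+)\s*(\S+)", line)
        if metric and test:
            results[test] = (float(metric.group(1)), metric.group(2))
    return results


results = parse_qperf(QPERF_OUTPUT)

# The ratios are only meaningful because both runs report the same units
# (us for latency, GB/sec for bandwidth).
latency_ratio = results["tcp_lat"][0] / results["rc_lat"][0]
bandwidth_ratio = results["rc_bw"][0] / results["tcp_bw"][0]

print(f"RC latency is {latency_ratio:.1f}x lower than TCP")
print(f"RC bandwidth is {bandwidth_ratio:.1f}x higher than TCP")
```

The sketch does no unit conversion, so it would need extending if qperf reported, say, ms for one test and us for the other.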


Quoting Jeff Johnson <jeff.john...@aeoncomputing.com>:

Faraz,

I didn't notice any tests where you actually tested the ip layer. You
should run some iperf tests between nodes to make sure ipoib functions.
Your infiniband/rdma can be working fine and ipoib can be dysfunctional.
You need to ensure the ipoib configuration, like any ip environment, is
configured the same on all nodes (network/subnet, netmask, mtu, etc) and
that all of the nodes are configured for the same mode (connected vs
datagram). If you can't run iperf then there is something broken in the
ipoib configuration.
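
A rough Python sketch of the consistency check described above: gather the ipoib settings from each node and report any that differ. The helper names and the example node values are hypothetical, not taken from this cluster. read_local_ipoib assumes a Linux node whose interface exposes /sys/class/net/<iface>/mode and /sys/class/net/<iface>/mtu, which is the usual case for IPoIB interfaces.

```python
from pathlib import Path


def read_local_ipoib(iface="ib0"):
    """Read mode and MTU for one IPoIB interface from Linux sysfs."""
    base = Path("/sys/class/net") / iface
    return {
        "mode": (base / "mode").read_text().strip(),  # "datagram" or "connected"
        "mtu": int((base / "mtu").read_text()),
    }


def find_mismatches(settings_by_node):
    """Return {setting: {value: [nodes]}} for settings that differ across nodes."""
    mismatches = {}
    keys = {key for settings in settings_by_node.values() for key in settings}
    for key in sorted(keys):
        by_value = {}
        for node, settings in sorted(settings_by_node.items()):
            by_value.setdefault(settings.get(key), []).append(node)
        if len(by_value) > 1:
            mismatches[key] = by_value
    return mismatches


# Hypothetical values for illustration only.
example = {
    "node-a": {"mode": "connected", "mtu": 65520, "netmask": "255.255.255.0"},
    "node-b": {"mode": "datagram", "mtu": 2044, "netmask": "255.255.255.0"},
}

for key, by_value in find_mismatches(example).items():
    print(key, by_value)
```

In practice the per-node dictionaries would come from running read_local_ipoib on every node (over ssh or a parallel shell) and adding the address and netmask from the node's network configuration. An empty result means the settings that were collected agree everywhere; it does not replace the iperf test itself.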

--Jeff

On Thu, Aug 3, 2017 at 8:41 AM, Faraz Hussain <i...@feacluster.com> wrote:

Thanks for everyone's help. Using the Ohio State tests, qperf and
perfquery I am convinced the IB network is working. The only thing that
still bothers me is that I cannot get mpirun to use the tcp network. I
tried all combinations of --mca btl to no avail. It is not important,
more just curiosity.
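
One possible explanation, offered as a guess and not a diagnosis: in Open MPI the btl setting only takes effect when the ob1 PML is in use. If this build selects a different PML (for example cm with a vendor MTL), changing --mca btl would not change where the traffic goes. The Python sketch below just assembles an mpirun command line that pins the PML to ob1 before choosing the tcp BTL. The function name and the ./osu_bw path are hypothetical; the host names are the two nodes from the qperf runs in this thread.

```python
import shlex


def mpirun_tcp_command(program, hosts, tcp_interface=None, verbose=False):
    """Build an mpirun argv that forces the ob1 PML and the tcp BTL."""
    cmd = [
        "mpirun",
        "--host", ",".join(hosts),
        "-np", str(len(hosts)),
        "--mca", "pml", "ob1",       # BTLs are only used by the ob1 PML
        "--mca", "btl", "tcp,self",
    ]
    if tcp_interface:
        # Restrict the tcp BTL to one interface, e.g. the Ethernet NIC.
        cmd += ["--mca", "btl_tcp_if_include", tcp_interface]
    if verbose:
        # Ask the BTL framework to log which components it selects.
        cmd += ["--mca", "btl_base_verbose", "30"]
    cmd.append(program)
    return cmd


# "./osu_bw" is a placeholder path to the OSU bandwidth benchmark.
argv = mpirun_tcp_command("./osu_bw", ["lustwzb4", "lustwzb5"], verbose=True)
print(" ".join(shlex.quote(arg) for arg in argv))
```

If osu_bw still reports IB-level bandwidth with this command, the verbose output should at least show which BTL components were opened, which narrows down where the selection is being overridden.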



Quoting Michael Di Domenico <mdidomeni...@gmail.com>:

On Thu, Aug 3, 2017 at 10:10 AM, Faraz Hussain <i...@feacluster.com> wrote:

Thanks, I installed the MPI tests from Ohio State. I ran osu_bw and got
the results below. What is confusing is that I get the same result
whether I use tcp or openib (by passing --mca btl openib|tcp,self to my
mpirun command). I also tried changing the environment variable:
export OMPI_MCA_btl=tcp,self,sm. Results are the same regardless of tcp
or openib.

And when I do ifconfig -a I still see zero traffic reported for the ib0
and ib1 network.


If Open MPI uses RDMA for the traffic, ib0/ib1 will not show it;
you have to use perfquery.
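
A small Python sketch of how the perfquery counters can be turned into a traffic figure: snapshot the counters before and after a run and take the difference. It assumes perfquery prints lines of the form "PortXmitData:....(dots)....value", and it relies on the InfiniBand data counters being in units of 4 bytes. The snapshot values below are invented for illustration.

```python
import re


def parse_counters(text):
    """Parse 'Name:.....value' lines into {name: int}, skipping other lines."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"^(\w+):\.+(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def bytes_moved(before, after, name):
    """Counter delta in bytes; PortXmitData/PortRcvData count 4-byte words."""
    return (after[name] - before[name]) * 4


# Hypothetical snapshots taken before and after an MPI run.
BEFORE = """\
# Port counters: example header line, ignored by the parser
PortXmitData:....................1000
PortRcvData:.....................2000
"""
AFTER = """\
# Port counters: example header line, ignored by the parser
PortXmitData:....................251000
PortRcvData:.....................502000
"""

before = parse_counters(BEFORE)
after = parse_counters(AFTER)
for name in ("PortXmitData", "PortRcvData"):
    print(name, bytes_moved(before, after, name), "bytes")
```

The classic 32-bit counters can saturate quickly on a fast link, so for longer runs the extended 64-bit counters (perfquery -x) are the safer source for the snapshots.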
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf








--
------------------------------
Jeff Johnson
Co-Founder
Aeon Computing

jeff.john...@aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x1001   f: 858-412-3845
m: 619-204-9061

4170 Morena Boulevard, Suite D - San Diego, CA 92117

High-Performance Computing / Lustre Filesystems / Scale-out Storage


