Here are the latency numbers when running the Ohio State test:
mpirun -np 2 -machinefile hostfile ./osu_latency
# OSU MPI Latency Test v5.3.2
# Size Latency (us)
0 1.57
1 1.22
2 1.19
4 1.20
8 1.17
16 1.20
32 1.23
64 1.29
128 1.42
256 1.76
512 2.07
1024 2.62
2048 3.63
4096 4.65
8192 6.46
16384 10.34
32768 13.37
65536 19.03
131072 33.04
262144 61.70
524288 119.93
1048576 231.21
2097152 455.84
4194304 907.89
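For reference, the bandwidth side of the same suite (osu_bw, mentioned further down in the thread) can be run the same way; a minimal sketch, assuming the same hostfile and working directory:

mpirun -np 2 -machinefile hostfile ./osu_bw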
Quoting Jon Tegner <teg...@renget.se>:
Isn't the latency over RDMA a bit high? When I've tested QDR and FDR, I tend to see around 1 us between two nodes (using mpitests-osu_latency).
/jon
On 08/03/2017 06:50 PM, Faraz Hussain wrote:
Here are the results from the TCP and RDMA tests. I take them to mean that the IB network is performing at the expected speed.
[hussaif1@lustwzb5 ~]$ qperf lustwzb4 -t 30 tcp_lat tcp_bw
tcp_lat:
latency = 24.2 us
tcp_bw:
bw = 1.19 GB/sec
[hussaif1@lustwzb5 ~]$ qperf lustwzb4 -t 30 rc_lat rc_bw
rc_lat:
latency = 7.76 us
rc_bw:
bw = 4.56 GB/sec
[hussaif1@lustwzb5 ~]$
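For anyone reproducing these numbers: qperf needs a listener running on the remote host first; a minimal sketch, assuming the same pair of nodes as above:

# on lustwzb4: start the qperf server (listens until killed)
qperf
# on lustwzb5: run the TCP and RDMA (reliable-connected) latency/bandwidth tests, 30 s each
qperf lustwzb4 -t 30 tcp_lat tcp_bw rc_lat rc_bw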
Quoting Jeff Johnson <jeff.john...@aeoncomputing.com>:
Faraz,
I didn't notice any tests where you actually exercised the IP layer. You should run some iperf tests between nodes to make sure IPoIB functions. Your InfiniBand/RDMA can be working fine while IPoIB is dysfunctional. You need to ensure the IPoIB configuration, like any IP environment, is configured the same on all nodes (network/subnet, netmask, MTU, etc.) and that all of the nodes are configured for the same mode (connected vs. datagram). If you can't run iperf, then something is broken in the IPoIB configuration.
--Jeff
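A minimal sketch of the checks Jeff describes, assuming iperf3 is available and ib0 is the IPoIB interface on both nodes (the interface name and addresses are assumptions):

# on one node: start the iperf3 server
iperf3 -s
# on the other node: point the client at the server's ib0 (IPoIB) address
iperf3 -c <ib0-address-of-server>
# on every node: confirm the MTU and the connected vs. datagram mode match
ip addr show ib0
cat /sys/class/net/ib0/mode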
On Thu, Aug 3, 2017 at 8:41 AM, Faraz Hussain <i...@feacluster.com> wrote:
Thanks for everyone's help. Using the Ohio State tests, qperf, and perfquery, I am convinced the IB network is working. The only thing that still bothers me is that I cannot get mpirun to use the TCP network. I tried all combinations of --mca btl to no avail. It is not important, more a matter of curiosity.
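One possible explanation is that a non-default PML (e.g. MXM/UCX) is carrying the traffic and bypassing the BTL layer entirely, in which case pinning the PML to ob1 and restricting the TCP BTL to the Ethernet interface may force the TCP path; a minimal sketch, with eth0 as an assumed interface name:

mpirun -np 2 -machinefile hostfile \
    --mca pml ob1 \
    --mca btl tcp,self \
    --mca btl_tcp_if_include eth0 \
    ./osu_latency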
Quoting Michael Di Domenico <mdidomeni...@gmail.com>:
On Thu, Aug 3, 2017 at 10:10 AM, Faraz Hussain <i...@feacluster.com> wrote:
Thanks, I installed the MPI tests from Ohio State. I ran osu_bw and got the results below. What is confusing is that I get the same result whether I use tcp or openib (by passing --mca btl openib|tcp,self to my mpirun command). I also tried changing the environment variable: export OMPI_MCA_btl=tcp,self,sm. The results are the same regardless of tcp or openib. And when I do ifconfig -a, I still see zero traffic reported for the ib0 and ib1 interfaces.
If Open MPI uses RDMA for the traffic, ib0/ib1 will not show any traffic; you have to use perfquery.
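A minimal sketch of confirming RDMA traffic with perfquery: snapshot the extended 64-bit port counters before and after a run and compare PortXmitData/PortRcvData (counted in 4-byte words):

# before the MPI run
perfquery -x
# ... run the benchmark ...
# after the run: the data counters should have grown
perfquery -x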
--
------------------------------
Jeff Johnson
Co-Founder
Aeon Computing
jeff.john...@aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x1001 f: 858-412-3845
m: 619-204-9061
4170 Morena Boulevard, Suite D - San Diego, CA 92117
High-Performance Computing / Lustre Filesystems / Scale-out Storage
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf