Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread Faraz Hussain
Here are the latency numbers when running the Ohio State test:
mpirun -np 2 -machinefile hostfile ./osu_latency
# OSU MPI Latency Test v5.3.2
# Size          Latency (us)
0                       1.57
1                       1.22
2                       1.19
4                       1.20
8
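As a rough way to read results like these, the smallest latency can be pulled out of the osu_latency output with awk. This is only a sketch: min_latency is a hypothetical helper, not part of the OSU benchmarks, and the sample rows are the complete rows quoted in the message above.

```shell
#!/bin/sh
# Sketch: report the lowest latency (in us) found in osu_latency output.
# min_latency is a hypothetical helper name, not an OSU tool.
min_latency() {
    # Skip header lines that start with '#', keep rows with exactly two
    # fields (size, latency), and track the minimum of the second field.
    awk '!/^#/ && NF == 2 {
             if (min == "" || $2 + 0 < min + 0) min = $2
         }
         END { if (min != "") print min }'
}

# Complete rows copied from the message above.
sample='# OSU MPI Latency Test v5.3.2
# Size          Latency (us)
0                       1.57
1                       1.22
2                       1.19
4                       1.20'

printf '%s\n' "$sample" | min_latency
```

On a live system the same helper could sit at the end of the pipeline, for example: mpirun -np 2 -machinefile hostfile ./osu_latency | min_latency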

Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread Jon Tegner
Isn't latency over RDMA a bit high? When I've tested QDR and FDR I tend to see around 1 us (using mpitests-osu_latency) between two nodes.
/jon
On 08/03/2017 06:50 PM, Faraz Hussain wrote: Here is the result from the tcp and rdma tests. I take it to mean that IB network is performing at the ex

Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread Faraz Hussain
Here is the result from the tcp and rdma tests. I take it to mean that IB network is performing at the expected speed.
[hussaif1@lustwzb5 ~]$ qperf lustwzb4 -t 30 tcp_lat tcp_bw
tcp_lat: latency = 24.2 us
tcp_bw: bw = 1.19 GB/sec
[hussaif1@lustwzb5 ~]$ qperf lustwzb4 -t 30 rc_lat rc
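To compare TCP and RDMA runs side by side, it can help to flatten qperf output into one record per metric. The sketch below assumes qperf prints each test name on its own line ending in a colon, with the "name = value unit" lines indented beneath it; qperf_flat is a hypothetical helper, and the sample values are the TCP numbers from the message above.

```shell
#!/bin/sh
# Sketch: turn qperf output into "test metric value unit" records.
# qperf_flat is a hypothetical helper name. Assumed input layout:
#   tcp_lat:
#       latency  =  24.2 us
qperf_flat() {
    awk '/^[a-z_]+:/ { test = $1; sub(/:$/, "", test); next }  # test header line
         /=/         { print test, $1, $3, $4 }                # metric = value unit'
}

# TCP values copied from the message above, laid out in the assumed format.
printf 'tcp_lat:\n    latency  =  24.2 us\ntcp_bw:\n    bw  =  1.19 GB/sec\n' | qperf_flat
```

Running the same filter over the rc_lat and rc_bi_bw results gives lines that can be compared directly against the TCP ones.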

Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread Gus Correa
Hi Faraz,
+1 to John's suggestion of joining the Open MPI list. Your questions are now veering towards Open MPI specifics, and you will get great feedback on this topic there. If you want to use TCP/IP instead of RDMA (say, IPoIB or Gigabit Ethernet cards), you can use it if you tell Open MPI not

Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread Jeff Johnson
Faraz, I didn't notice any tests where you actually tested the ip layer. You should run some iperf tests between nodes to make sure ipoib functions. Your infiniband/rdma can be working fine and ipoib can be dysfunctional. You need to ensure the ipoib configuration, like any ip environment, is conf
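A minimal sketch of preparing such an iperf run: first find the IPv4 address assigned to ib0, then point the iperf client at it. ib0_addr is a hypothetical helper that parses the one-line output of "ip -o -4 addr show", and the address in the sample line is made up for illustration.

```shell
#!/bin/sh
# Sketch: extract the IPv4 address of ib0 from "ip -o -4 addr show" output.
# ib0_addr is a hypothetical helper name.
ib0_addr() {
    # In the one-line format, field 2 is the interface, field 3 the
    # address family, field 4 the address with its prefix length.
    awk '$2 == "ib0" && $3 == "inet" { sub(/\/.*/, "", $4); print $4; exit }'
}

# Illustrative line only; 10.10.0.5 is an assumed address, not from the thread.
sample_line='4: ib0    inet 10.10.0.5/24 brd 10.10.0.255 scope global ib0'
printf '%s\n' "$sample_line" | ib0_addr
```

With the address known, one node would run "iperf -s" and the other "iperf -c" followed by that address. If the reported throughput is far below the RDMA figure or the connection fails, the IPoIB configuration is the place to look.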

Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread John Hearns via Beowulf
Faraz, do you mean the IPoIB TCP network, i.e. the ib0 interface? Good question. I would advise joining the Open MPI list. They are very friendly over there. I have always seen polite and helpful replies even to dumb questions there (such as the ones I ask). I actually had to do something similar re

Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread Faraz Hussain
Thanks for everyone's help. Using the Ohio State tests, qperf and perfquery, I am convinced the IB network is working. The only thing that still bothers me is that I cannot get mpirun to use the tcp network. I tried all combinations of --mca btl to no avail. It is not important, more just curios

Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread John Hearns via Beowulf
Faraz, I think that you have got things sorted out. However, I think that the number of options in Open MPI is starting to confuse you. But do not lose heart! I have been in the same place myself many times. Specifically, I am thinking of one time when a customer asked me to benchmark the latency acr

Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread Michael Di Domenico
On Thu, Aug 3, 2017 at 10:10 AM, Faraz Hussain wrote:
> Thanks, I installed the MPI tests from Ohio State. I ran osu_bw and got the
> results below. What is confusing is I get the same result if I use tcp or
> openib ( by doing --mca btl openib|tcp,self with my mpirun command ). I also
> tried cha

Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread Faraz Hussain
Thanks, I installed the MPI tests from Ohio State. I ran osu_bw and got the results below. What is confusing is I get the same result if I use tcp or openib ( by doing --mca btl openib|tcp,self with my mpirun command ). I also tried changing the environment variable: export OMPI_MCA_btl=tcp
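One way to check which transport is really in use is to pin the BTL explicitly and ask Open MPI to log its component selection. The sketch below only prints the command lines. btl_cmd is a hypothetical helper; hostfile and ./osu_bw are the names used in the thread. Adding "--mca pml ob1" rests on an assumption: if the installation defaults to a PML that bypasses BTLs, the btl setting alone would appear to have no effect. Whether that applies to this cluster is not known from the thread.

```shell
#!/bin/sh
# Sketch: print an mpirun command line that forces a single Open MPI BTL.
# btl_cmd is a hypothetical helper name.
btl_cmd() {
    transport=$1   # "tcp" or "openib"
    # --mca pml ob1             use the PML that drives BTLs (assumption, see lead-in)
    # --mca btl <name>,self     allow only the named BTL plus loopback
    # --mca btl_base_verbose 30 log which BTL components are opened
    printf 'mpirun -np 2 -machinefile hostfile --mca pml ob1 --mca btl %s,self --mca btl_base_verbose 30 ./osu_bw\n' "$transport"
}

btl_cmd tcp
btl_cmd openib
```

If the two runs still report identical bandwidth, the verbose output should indicate which component was actually selected.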

Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread Joe Landman
On 08/03/2017 09:21 AM, Faraz Hussain wrote:
I ran the qperf command between two compute nodes ( b4 and b5 ) and got:
[hussaif1@lustwzb5 ~]$ qperf lustwzb4 -t 30 rc_lat rc_bi_bw
rc_lat: fd latency = 7.73 us
rc_bi_bw: bw = 9.06 GB/sec
If I understand correctly, I would need to ena

Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread Faraz Hussain
I ran the qperf command between two compute nodes ( b4 and b5 ) and got:
[hussaif1@lustwzb5 ~]$ qperf lustwzb4 -t 30 rc_lat rc_bi_bw
rc_lat: fd latency = 7.73 us
rc_bi_bw: bw = 9.06 GB/sec
If I understand correctly, I would need to enable ipoib and then rerun test? It would then s

Re: [Beowulf] How to know if infiniband network works?

2017-08-03 Thread tegner
I often use
mpirun --np 2 --machinefile mpd.hosts mpitests-osu_latency
mpirun --np 2 --machinefile mpd.hosts mpitests-osu_bw
to test bandwidth and latency between two specific nodes (listed in mpd.hosts). On a CentOS/Red Hat system these can be installed from the package mpitests-openmpi.
/jon
O