On Mon, Oct 16, 2017 at 7:16 AM, Peter Kjellström <c...@nsc.liu.se> wrote:
> Another is that your MPIs tried to use rdmacm and that in turn tried to
> use ibacm which, if incorrectly setup, times out after ~1m. You can
> verify ibacm functionality by running for example:
>
> user@n1 $ ib_acme -d n2
> ...
> user@n1 $
>
> This should be near instant if ibacm works as it should.
I didn't specifically tell MPI to use one connection setup vs. another, but I'll see if I can track down what Open MPI is doing in that regard. However, your test above fails on my machines:

user@n1# ib_acme -d n3
service: localhost
destination: n3
ib_acm_resolve_ip failed: cannot assign requested address
return status 0x0

The /etc/rdma/ibacm_addr.cfg file just lists the data specific to each host, which is gathered by ib_acme -A.

Truthfully, I never configured it; I thought it just "worked" on its own, but perhaps not. I'll have to google some.
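On the Open MPI side, if I'm reading the FAQ right (not verified on my cluster yet), the openib BTL picks its connection method via the btl_openib_cpc_include/exclude MCA parameters, so something like the following should show what's available and let me rule rdmacm in or out (./hello_mpi is just a stand-in test binary):

# list the parameters the openib btl exposes, including the cpc ones
user@n1 $ ompi_info --param btl openib --level 9 | grep cpc

# try a run with rdmacm excluded, to see if the ~1 minute stall goes away
user@n1 $ mpirun --mca btl_openib_cpc_exclude rdmacm -np 2 -H n1,n3 ./hello_mpi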
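For the archives, my rough plan for fixing ibacm itself, pieced together from the ib_acme/ibacm man pages and untested so far, so treat it as a guess:

# check whether the ibacm daemon is even running/listening locally
user@n1 $ systemctl status ibacm

# regenerate the per-host address data with ib_acme -A and make sure it
# ends up in /etc/rdma/ibacm_addr.cfg on each node, then restart the daemon
user@n1 $ systemctl restart ibacm

# retest resolution to another node; per Peter this should be near instant
user@n1 $ ib_acme -d n3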