On Tue, 17 Oct 2017 10:59:41 -0400
Michael Di Domenico <mdidomeni...@gmail.com> wrote:
...
> > I_MPI_DAPL_PROVIDER=ofa-v2-mlx4_0-1u mpiexec.hydra ...
> >
> > You may want to set I_MPI_DEBUG=4 or so to see what it does.
>
> i can confirm that the dapl test with intelmpi is pretty speedy.
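
A rough way to put numbers on "pretty speedy" is to time the same launch with and without the explicit provider. This is only a sketch: LAUNCH and MPI_ARGS are placeholders of mine (the launcher arguments were elided above), the timing has one-second resolution, and the final grep assumes the I_MPI_DEBUG output mentions "dapl" somewhere.

```shell
#!/bin/sh
# Sketch: time an MPI launch with the default DAPL provider and with an
# explicitly selected one. LAUNCH and MPI_ARGS are hypothetical
# placeholders; substitute your own launcher arguments and test binary.
LAUNCH=${LAUNCH:-mpiexec.hydra}
MPI_ARGS=${MPI_ARGS:-"-n 2 ./a.out"}

# time_run LABEL [VAR=value ...] COMMAND [ARGS...]
# Runs COMMAND with I_MPI_DEBUG=4 plus any extra VAR=value settings,
# saves its output to run-LABEL.log and prints the wall time in seconds.
time_run() {
    label=$1; shift
    start=$(date +%s)
    env I_MPI_DEBUG=4 "$@" > "run-$label.log" 2>&1
    rc=$?
    end=$(date +%s)
    echo "$label: $((end - start)) s (exit $rc)"
}

if command -v "$LAUNCH" >/dev/null 2>&1; then
    # $MPI_ARGS is left unquoted on purpose so it splits into words.
    time_run default  "$LAUNCH" $MPI_ARGS
    time_run explicit I_MPI_DAPL_PROVIDER=ofa-v2-mlx4_0-1u "$LAUNCH" $MPI_ARGS
    # Assumption: the debug lines naming the chosen provider contain "dapl".
    grep -i dapl run-default.log run-explicit.log
else
    echo "$LAUNCH not found, nothing to time" >&2
fi
```

The wall time here includes the test program itself, so use something that exits right after MPI_Init/MPI_Finalize if you only care about startup.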

It may be interesting to see what it picks by default (compare output
with I_MPI_DEBUG).

> when i startup an mpi job without dapl enabled it takes ~60 seconds

I think you mean using the default dapl provider (vs. the specific ucm
provider I suggested). IntelMPI should default to dapl on Mellanox
regardless of version I think (unless possibly if your IntelMPI is very
new and you have a libfabric version installed...).

> before the test actually starts, with dapl enabled it's only a few
> seconds.

That is still very slow. For reference I timed 1024 rank startup on one
of our systems with IntelMPI and dapl on ucm and it's a bit below 0.5s
depending on how you time it (some amount of lazy init is happening).

If I force IntelMPI on that system to run using verbs,
I_MPI_FABRICS=ofa, then that startup takes 5 seconds (~10x slower).

I have not tested a dapl provider using rdmacm as that would require me
to change our system dat.conf I think.

Either way, with 60s time scales and ibacm so broken it fails instantly
I suspect you have some hostname/dns/tcp-ip-on-eth or other fundamental
problem somewhere.

/Peter K
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf