i'm seeing issues on a mellanox fdr10 cluster where the mpi setup and
teardown take longer than i expect they should on larger rank count
jobs.  i'm only trying to run ~1000 ranks and the startup time is over
a minute.  i tested this with both openmpi and intel mpi, and both
exhibit close to the same behavior.

has anyone else seen this or know how to fix it?  i expect ~1000
ranks to take some time to set up, but it seems to be taking longer
than i think it should.
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
