On 2/27/25 3:19 AM, Brice Goglin wrote:
Hello

While meeting with vendors to buy our next cluster, we got different recommendations about the network for MPI. The cluster will likely be about 100 nodes. Some vendors claim RoCE is enough to get <2 us latency and good bandwidth for such a low number of nodes. Others say RoCE is far behind IB for both latency and bandwidth, and that we likely need IB if we care about network performance.

If anybody has tried MPI over RoCE on such a "small" cluster, what NICs and switches did you use?

Also, is the configuration easy from the admin (installation) and users (MPI options) points of view?


I hope this isn't a dumb question: Do the Ethernet switches you're looking at have crossbar switches inside them? I believe crossbar switches are a requirement for IB, but are only found in "higher performance" Ethernet switches. IB isn't just about latency. The crossbar switches allow for high bisection bandwidth, non-blocking communication, etc.
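
One rough way to compare vendor proposals on the non-blocking point is to work out the oversubscription ratio of each leaf switch and the bisection bandwidth that follows from it. The sketch below is only an illustration: the `LeafSwitch` class, the port counts, and the link speeds are all made-up assumptions, not figures from any vendor, and the bisection estimate assumes a simple two-tier leaf/spine fabric where the worst-case traffic all crosses the uplinks.

```python
from dataclasses import dataclass


@dataclass
class LeafSwitch:
    """Hypothetical leaf switch; all fields are illustrative assumptions."""
    downlinks: int        # ports facing compute nodes
    uplinks: int          # ports facing spine switches
    downlink_gbps: float  # speed of each node-facing port
    uplink_gbps: float    # speed of each spine-facing port

    def oversubscription(self) -> float:
        """Node-facing bandwidth divided by spine-facing bandwidth.

        A value of 1.0 or less means the leaf is non-blocking toward the spine.
        """
        down = self.downlinks * self.downlink_gbps
        up = self.uplinks * self.uplink_gbps
        return down / up


def bisection_bandwidth_gbps(nodes: int, link_gbps: float,
                             oversubscription: float) -> float:
    """Rough bisection bandwidth: half the nodes send to the other half,
    scaled down by the oversubscription ratio when it exceeds 1.0."""
    return (nodes / 2) * link_gbps / max(oversubscription, 1.0)


if __name__ == "__main__":
    # Assumed example configurations for a 100-node cluster with 100 Gb/s NICs.
    balanced = LeafSwitch(downlinks=32, uplinks=8,
                          downlink_gbps=100.0, uplink_gbps=400.0)
    tapered = LeafSwitch(downlinks=40, uplinks=8,
                         downlink_gbps=100.0, uplink_gbps=100.0)
    for name, leaf in (("balanced", balanced), ("tapered", tapered)):
        ratio = leaf.oversubscription()
        bw = bisection_bandwidth_gbps(100, 100.0, ratio)
        print(f"{name}: oversubscription {ratio:.1f}:1, "
              f"approx. bisection bandwidth {bw:.0f} Gb/s")
```

This says nothing about latency or about how the switch ASIC is built internally; it only checks whether the proposed topology can carry full traffic between the two halves of the cluster.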


--
Prentice

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
