In my experience, RoCE is just as fast as, if not faster than, IB within a 
single Ethernet switch; it’s when you go beyond that switch that you lose out.
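
If you want to check the latency claim yourself rather than take a vendor's 
word for it, a quick smoke test between two nodes is perftest, which runs over 
RoCE as well as IB. A minimal sketch (the device name mlx5_0 and GID index 3 
are assumptions; substitute whatever ibv_devinfo reports on your hardware):

    # on the server node
    ib_write_lat -d mlx5_0 -x 3
    # on the client node, pointing at the server
    ib_write_lat -d mlx5_0 -x 3 server-hostname

That gives you the raw RDMA write latency, which is the floor any MPI library 
will sit on top of.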
The trick has been finding NICs that are natively supported by OFED. I still 
tend to find the Mellanox NICs the most reliable and best supported.
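
The first thing I'd check on any candidate NIC is simply whether the stock 
rdma-core/OFED tools see it as an RDMA device at all, e.g. (mlx5_0 is just an 
example name):

    # list RDMA devices, then dump port/GID details for one of them
    ibv_devices
    ibv_devinfo -d mlx5_0 -v

If a NIC doesn't show up cleanly here, no amount of MPI-level tuning will 
save you.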

Then the question is: if you’re buying Mellanox NICs anyway, why not go the 
whole hog and get IB, particularly as you may grow beyond a single switch.
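
On the question of user-side MPI options: with Open MPI built against UCX, a 
sketch of the sort of invocation users typically end up with (the device:port 
string mlx5_0:1 is an assumption, substitute your own; osu_latency is the OSU 
micro-benchmark binary used here as an example):

    mpirun --mca pml ucx -x UCX_NET_DEVICES=mlx5_0:1 ./osu_latency

UCX treats RoCE as an ordinary RDMA transport, so the same command works 
unchanged if you later move to IB.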

Matt.

> On 27 Feb 2025, at 19:19, Brice Goglin <brice.gog...@gmail.com> wrote:
> 
> Hello
> 
> While meeting vendors to buy our next cluster, we got different 
> recommendations about the network for MPI. The cluster will likely be about 
> 100 nodes. Some vendors claim RoCE is enough to get <2us latency and good 
> bandwidth for such a small node count. Others say RoCE is far behind IB on 
> both latency and bandwidth, and that we likely need IB if we care about 
> network performance.
> 
> If anybody has tried MPI over RoCE on such a "small" cluster, what NICs and 
> switches did you use?
> 
> Also, is the configuration easy from the admin's (installation) and the 
> users' (MPI options) points of view?
> 
> Thanks
> 
> Brice
> 
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
