Hi all,

Just wondering if any of you have numbers (or experience) with modern high-speed COTS Ethernet.
Latency mainly, but perhaps also message rate. Also ease of use with open-source products like OpenMPI, maybe Lustre? Flexibility in configuring clusters in the >= 1k node range?

We have a good idea of what to expect from InfiniBand offerings, and are familiar with scalable network topologies. But vendors seem to think that high-end Ethernet (100-400Gb) is competitive...

For instance, here's an excellent study of Cray/HPE Slingshot (non-COTS):
https://arxiv.org/pdf/2008.08886.pdf
(half RTT around 2 us, but this paper has great stuff about congestion, etc.)

Yes, someone is sure to say "don't try characterizing all that stuff - it's your application's performance that matters!" Alas, we're a generic "any kind of research computing" organization, so there are thousands of apps across all possible domains.

Another interesting topic is that nodes are becoming many-core - any thoughts?

Alternatively, are there other places to ask? Reddit or something less "greybeard"?

thanks, mark hahn
McMaster U / SharcNET / ComputeOntario / DRI Alliance Canada

PS: the snarky name "NVidiband" just occurred to me; too soon?

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit https://beowulf.org/cgi-bin/mailman/listinfo/beowulf