On Fri, 2015-01-23 at 01:36 -0500, Mark Hahn wrote:
> Hi all,
> I'd appreciate any comments about the state of 10G as a reasonable
> cluster network. Have you done any recent work on 10G performance?
>
> https://lwn.net/Articles/629155/
I had a go at using RoCE (which uses the OFED stack) with some Mellanox
NICs a year or so ago. I don't have any actual numbers, unfortunately,
but it's certainly possible to get "decent" performance, at least with
modest node counts, although I think Infiniband will perform better.
There are a couple of issues though:

- The configuration, particularly at the switch end, is fairly
  esoteric. Mellanox do have some pretty good documentation for their
  cards, but if you've got a switch from a different manufacturer you
  may have to play around a bit. It's certainly rather more involved
  than getting Infiniband working.

- To get decent performance (at least on latency) you need fairly
  high-end HCAs, a switch which supports the DCB stuff (I think?) and
  (Q)SFP+ transceivers/cables, the cost of which is in the same area
  as Infiniband.

There are a couple of advantages to 10GE, in that there are ASICs with
higher port counts and it's easy to integrate with an existing Ethernet
network, but for MPI performance I think Infiniband is still the way to
go (see the ping-pong sketch in the P.S. if you want to measure the
difference yourself).

It occurred to me the other day that it's about time we had something
better than 1GE for commodity networking. It's good news that switch
costs are coming down, but I've yet to see a server with an onboard
10GBASE-T adaptor (although I have seen some with SFP+ 10G).

Rob
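P.S. For anyone who'd rather put numbers on the latency question than
take my word for it, a minimal MPI ping-pong along these lines (plain
MPI calls, nothing vendor-specific; the binary name and host list are
just examples, run one rank on each of two nodes) makes the
interconnect difference pretty stark:

/* pingpong.c: one-way latency between two MPI ranks.
 * Build:  mpicc -O2 -o pingpong pingpong.c
 * Run:    mpirun -np 2 -H node1,node2 ./pingpong
 */
#include <mpi.h>
#include <stdio.h>

#define WARMUP 100
#define ITERS  10000

int main(int argc, char **argv)
{
    int rank, size, i;
    char buf = 0;
    double t0 = 0.0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size != 2) {
        if (rank == 0)
            fprintf(stderr, "run with exactly 2 ranks\n");
        MPI_Finalize();
        return 1;
    }

    for (i = 0; i < WARMUP + ITERS; i++) {
        if (i == WARMUP)
            t0 = MPI_Wtime();   /* start the clock after warm-up */
        if (rank == 0) {
            MPI_Send(&buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&buf, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(&buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(&buf, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        /* half the average round trip = one-way latency */
        printf("one-way latency: %.2f us\n",
               (t1 - t0) / ITERS / 2.0 * 1e6);

    MPI_Finalize();
    return 0;
}

Roughly speaking, you'd typically expect gigabit to land somewhere in
the tens of microseconds here and a decent Infiniband fabric in the
low single digits; where RoCE ends up depends a lot on the DCB tuning
mentioned above.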