Hi Don,

Don Holmgren wrote:
latency difference here matters to many codes). Perhaps of more significance, though, is that you can use oversubscription to lower the cost of your fabric. Instead of connecting 12 ports of a leaf switch to nodes and using the other 12 ports as uplinks, you might get away with 18 nodes and 6 uplinks, or 20 nodes and 4 uplinks. As core counts are increasing, this is becoming more and more viable for some applications.
It's important to note that the "full-bisection" touted by vendors exists on paper only. In reality, static routing provides full bisection for only a very small subset of traffic patterns; the average effective bisection on a diameter-3 Clos is ~40% of link rate. Adaptive routing improves that a lot, but it breaks packet ordering on the wire, and in-order delivery is a requirement for some network protocols.
In practice, "paper" full-bisection is nearly free when using a single enclosure, since all spine cables are on the backplane. For larger networks, where you have to pay for real cables to the spine level, it may make sense to be oversubscribed if the effective bisection is already bad (static routing), or if the collective communication in your large jobs is not bandwidth-bound. However, the latter is often false on many-core machines.
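To put rough numbers on the port splits quoted above, here is a minimal Python sketch. It assumes a 24-port leaf switch (implied by the 12 + 12 example) and computes only the nominal, on-paper figures; the function name and the per-node "share" metric are my own illustration, and none of this models routing effects.

```python
from fractions import Fraction

LEAF_PORTS = 24  # assumption: 24-port leaf, as implied by the 12 + 12 split


def oversubscription(nodes, uplinks):
    """Return (oversubscription ratio, nominal off-leaf share per node).

    The share is the fraction of link rate each node would get if every
    node on the leaf sent traffic off-leaf at once. Paper figure only.
    """
    if nodes <= 0 or uplinks <= 0:
        raise ValueError("nodes and uplinks must be positive")
    if nodes + uplinks > LEAF_PORTS:
        raise ValueError("split does not fit on one leaf switch")
    ratio = Fraction(nodes, uplinks)
    share = min(Fraction(1), Fraction(uplinks, nodes))
    return ratio, share


if __name__ == "__main__":
    for nodes, uplinks in [(12, 12), (18, 6), (20, 4)]:
        ratio, share = oversubscription(nodes, uplinks)
        print(f"{nodes} nodes / {uplinks} uplinks: {ratio}:1, "
              f"{float(share):.0%} of link rate per node off-leaf")
```

For the three splits this gives 1:1, 3:1 and 5:1, i.e. nominally 100%, 33% and 20% of link rate per node off-leaf. Set against the ~40% average effective bisection with static routing, the 18/6 split gives up less than the paper numbers alone would suggest.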
Patrick
_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
