Cool, FNNs are still being mentioned on the Beowulf mailing list... For those not familiar with the Flat Neighborhood Network (FNN) idea, check out this URL: http://aggregate.org/FNN/
For those who haven't played with our FNN generator cgi script, do try it out. Hank (my Ph.D. advisor) enhanced the cgi a while back to generate pretty multi-color pictures of the resulting FNNs. Unfortunately, for the particular input parameters from this thread (six 24-port switches and 50 nodes), each node would need a 3-port HCA (or 3 HCAs) and a 7th switch to generate a Universal FNN. (A small wiring-check sketch follows the quoted thread below.) FNNs don't really shine until you have 3 or 4 NICs/HCAs per compute node. Anyway, you would get a LOT more bandwidth with an FNN in this case... and of course, the "single-switch latency" that is characteristic of FNNs. Though, as others have mentioned, IB switch latency is pretty darn small, so latency would not be the primary reason to use FNNs with IB. I wonder if anyone has built an FNN using IB... or, for that matter, any link technology other than Ethernet?

On Thu, Jul 24, 2008 at 5:00 PM, Mark Hahn <[EMAIL PROTECTED]> wrote:
>> Well, the top configuration (and the one that I suggested) is the one
>> that we have tested and know works. We have implemented it in
>> hundreds of clusters. It also provides redundancy for the core
>> switches.
>
> just for reference, it's commonly known as a "fat tree", and is indeed
> widely used.
>
>> With any network you need to avoid like the plague any kind of loop;
>> they can cause weird problems and are pretty much unnecessary. For
>
> well, I don't think that's true - the most I'd say is that, given
> the usual spanning-tree protocol for eth switches, loops are a bug.
> but IB doesn't use eth's STP, and even smarter eth networks can take
> good advantage of multiple paths, even loopy ones.
>
>> instance, why would you put a line between the two core switches? Why
>> would that line carry any traffic?
>
> indeed - those examples don't make much sense. but there are many others
> that involve loops that could be quite nice. consider 36 nodes: with
> 2x24pt, you get 3:1 blocking (6 inter-switch links). with 3 switches, you
> can do 2:1 blocking (6 interlinks in a triangle, forming a loop.)
> dual-port nics provide even more entertainment (FNN, but also the ability
> to tolerate a leaf-switch failure...)
>
>> When you consider that it takes 2-4 µs for an MPI message to get from
>
> depends on the nic - mellanox claims ~1 us for connectx (haven't seen it
> myself yet.) I see 4-4.5 us latency (worse than myri 2g mx!) on
> pre-connectx mellanox systems.
>
>> one node to another on the same switch, each extra hop will only
>> introduce another 0.02 µs (I think?) to that latency, so it's not really
>
> with current hardware, I think 100 ns per hop is about right. mellanox
> claims 60 ns for the latest stuff.
>
>> Most applications don't use anything like the full bandwidth of the
>> interconnect, so the half-bisection bandwidth of everything can generally
>> be safely ignored.
>
> everything is simple for single-purpose clusters. for a shared cluster
> with a variety of job types, especially for large user populations, large
> jobs and large clusters, you want to think carefully about how much to
> compromise the fabric. consider, for instance, interference between a
> bw-heavy weather code and some latency-sensitive application (big and/or
> tightly-coupled.)
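What makes a wiring a Universal FNN is simply that every pair of nodes shares at least one switch, so any two nodes are a single switch hop apart. Here is a minimal Python sketch of that check (purely illustrative; it is not the aggregate.org cgi, and the function name and toy wiring are made up for this email):

    from itertools import combinations

    def is_flat_neighborhood(wiring, switch_ports):
        """wiring: node -> set of switch ids (one entry per NIC/HCA port).
        switch_ports: ports per switch available for node links."""
        # No switch may have more nodes attached than it has ports.
        load = {}
        for switches in wiring.values():
            for s in switches:
                load[s] = load.get(s, 0) + 1
        if any(count > switch_ports for count in load.values()):
            return False
        # Flat-neighborhood property: every node pair shares a switch.
        return all(wiring[a] & wiring[b] for a, b in combinations(wiring, 2))

    # Toy example: 4 nodes with 2-port NICs spread over three switches.
    toy = {0: {"A", "B"}, 1: {"A", "C"}, 2: {"B", "C"}, 3: {"A", "B"}}
    print(is_flat_neighborhood(toy, switch_ports=4))  # True

The hard part, of course, is searching for a wiring that passes this check with the switches and NICs you actually have, which is essentially what the generator cgi automates.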
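And for anyone wondering where Mark's 3:1 vs. 2:1 figures come from, here is the back-of-envelope arithmetic as a short Python sketch. This is my own reading of his numbers: 6 inter-switch links between each pair of switches, and the triangle's longer two-hop path is ignored.

    def blocking(nodes, n_switches, switch_ports, links_per_pair):
        # Blocking between two adjacent leaf switches: nodes that might want
        # to cross, divided by the direct links between that pair of switches.
        nodes_per_switch = nodes // n_switches
        uplink_ports = (n_switches - 1) * links_per_pair
        assert nodes_per_switch + uplink_ports <= switch_ports, "not enough ports"
        return nodes_per_switch / links_per_pair

    print(blocking(36, 2, 24, links_per_pair=6))  # 3.0 -> 3:1 with two switches
    print(blocking(36, 3, 24, links_per_pair=6))  # 2.0 -> 2:1 with a triangle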
--
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
[EMAIL PROTECTED] || [EMAIL PROTECTED]
I'm a bright... http://www.the-brights.net/

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf