We recently tested 48-port gigabit switches from Extreme (summit-48t) and Force10 (s50). We found that the Extreme Networks switch performed better than the Force10 when all 48 ports were active. The s50 appeared to "choke" at certain message sizes, leading to erratic rates and reduced overall performance. The summit was much smoother, with very little variation in throughput. For example, a bidirectional edge exchange had a max throughput of 1540Mbps under LAM (using the Broadcom NIC), while 16 pairs (32 nodes) had a max throughput of 1520Mbps per pair; the optimum message size was about 250KBytes. We also tested 2 switches connected by a 10G stacking cable. We could connect 12 pairs of ports (12 on each switch) and run at essentially the same speed (around 1500Mbps per pair) through the stacking cable.
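
As a rough sanity check on the stacking-cable result, the load on the link can be estimated from the figures above. This is only a sketch under assumptions that are not stated in the post: the per-pair figure is taken as the sum of both directions, traffic is assumed symmetric, and the stacking link is assumed to be full duplex at 10Gbps in each direction. The function name and parameters are made up for illustration.

```python
# Rough load estimate for the stacked-switch test described above.
# Assumptions (not stated in the post): the per-pair figure is the sum of
# both directions, traffic is symmetric, and the stacking link is full
# duplex at 10 Gbps in each direction.

def stacking_link_load(pairs, mbps_per_pair, link_gbps=10.0):
    """Return (per-direction load in Gbps, fraction of link capacity used)."""
    total_gbps = pairs * mbps_per_pair / 1000.0   # both directions combined
    per_direction_gbps = total_gbps / 2.0         # symmetric split
    return per_direction_gbps, per_direction_gbps / link_gbps

load, fraction = stacking_link_load(pairs=12, mbps_per_pair=1500)
print(f"{load:.1f} Gbps per direction, {fraction:.0%} of the link")
```

Under those assumptions, 12 pairs at about 1500Mbps each would put the stacking cable close to, but not over, its nominal capacity, which would be consistent with seeing essentially the same per-pair speed.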
There are a lot of hidden gotchas in switch technology, so "wire speed" means next to nothing. For example, the Force10 switch (which is a good edge switch) has four 12-port ASICs. Ports on the same ASIC really do communicate at wire speed, but between ASICs the max bandwidth is 10Gbps. With 12 gigabit ports sharing that 10Gbps, the max throughput is only 83% of what you would expect. By contrast, the Extreme switch is supposedly "flat", with full bandwidth under all port configurations. The Broadcom NICs could not push data fast enough to really stress the Extreme switch (only about 1500Mbps max per pair), but with MPIGAMMA I can get over 1800Mbps between pairs, which will increase the load on the switch.

These switches are not cheap; they list for $6000-8000, but they outperform the cheaper switches by a considerable margin. We have not been able to get close to the theoretical bandwidth from our cheap GigE switches (HP 2724, 3Com SS3).

I have recently run netpipe with MPI/GAMMA (http://www.disi.unige.it/project/gamma/mpigamma/) using two Intel PRO1000 NICs (82545GM) wired back-to-back. The nodes are Dell PE850s with a 3.0GHz P4D (dual core). MPI latency was 8.6 microsecs one way and 8.8 microsecs for bidirectional messages. The max throughput was 983Mbps one way and 1856Mbps bidirectional. The half-throughput message size is about 4KBytes. These results are consistent with the pingpong tests reported on the MPIGAMMA website. The higher throughput will enable a better test of switch performance.

We have just installed a stacked array of 4 x summit-48t's. I will post benchmarks soon.

Tony

-------------------------------
Tony Ladd
Professor, Chemical Engineering
University of Florida
PO Box 116005
Gainesville, FL 32611-6005

Tel: 352-392-6509
FAX: 352-392-9513
Email: [EMAIL PROTECTED]
Web: http://ladd.che.ufl.edu