Ramiro -

You might also consider buying just a single 24-port switch for your 22 nodes. When you expand, you can either replace it with a larger switch or build a distributed switch fabric, with a number of leaf switches connecting into a central spine switch (or switches). By the time you expand to the larger cluster, switches based on the announced 36-port Mellanox crossbar silicon will be available, and perhaps per-port prices will have dropped enough to justify delaying the purchase and accepting the disruption at the time of expansion.

If your applications can tolerate some oversubscription (less than a 1:1 ratio of leaf-to-spine uplinks to leaf-to-node connections), a distributed switch fabric (leaf and spine) has an advantage over a single central switch: the cables between the leaf switches and your nodes are shorter (and cheaper), and relatively few longer cables are needed from the leaves back to the spine.
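To make the oversubscription trade-off concrete, here is a rough sizing sketch in Python. The function name leaf_spine_plan, its parameters, and the choice of 8 uplinks per leaf are my own illustrative assumptions, not vendor figures; it only counts ports and ignores routing, spine switch sizes, and cable lengths.

```python
import math


def leaf_spine_plan(nodes, leaf_ports=24, uplinks_per_leaf=8):
    """Rough port count for a two-tier leaf/spine fabric.

    Hypothetical helper for illustration only: each leaf switch uses
    `uplinks_per_leaf` ports toward the spine and the rest for nodes.
    """
    if not 0 < uplinks_per_leaf < leaf_ports:
        raise ValueError("uplinks_per_leaf must be between 1 and leaf_ports - 1")

    nodes_per_leaf = leaf_ports - uplinks_per_leaf
    leaves = math.ceil(nodes / nodes_per_leaf)

    return {
        "nodes_per_leaf": nodes_per_leaf,
        "leaves": leaves,
        # Total spine-side ports needed to terminate every uplink.
        "spine_ports": leaves * uplinks_per_leaf,
        # Node ports per uplink: 1.0 means no oversubscription.
        "oversubscription": nodes_per_leaf / uplinks_per_leaf,
    }


if __name__ == "__main__":
    # Assumed example: the planned 120-node cluster on 24-port leaves.
    for uplinks in (12, 8):
        print(uplinks, leaf_spine_plan(120, leaf_ports=24, uplinks_per_leaf=uplinks))
```

Under these assumptions, 120 nodes on 24-port leaves with 8 uplinks each works out to 16 nodes per leaf, 8 leaf switches, 64 spine ports, and 2:1 oversubscription. With 12 uplinks per leaf (no oversubscription) it takes 10 leaves and 120 spine ports, so there are as many long cables as nodes.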

We have many Flextronics switches - SDR and DDR, 24-port and 144-port - on a pair of large clusters (520 nodes and 600 nodes) built in 2005 and 2006. No complaints. But we have been self-supporting, and I would guess you would have very different support structures with Voltaire or Qlogic. With the Flextronics switches you will definitely be using the OFED stack, and you will have to run a subnet manager on one of your nodes (a dedicated node is probably best). With the Voltaire or Qlogic switches you could optionally buy an embedded subnet manager, depending upon model, though I believe an external subnet manager is still recommended for a large fabric.

Don Holmgren
Fermilab




On Tue, 10 Jun 2008, Ramiro Alba Queipo wrote:

Hello everybody:

We are about to build an HPC cluster with an InfiniBand network, starting
from 22 dual-socket nodes with AMD quad-core processors, and in a year or
so we will have about 120 nodes. We will be using InfiniBand both
for computation and for storage.
We need a modular solution, and we have 3
candidates:

a) Voltaire Grid Director SDR or DDR 288 ports (9988 or 2012 models) ->
seems very good and well supported, but very expensive.

b) Qlogic SilverStorm 9120 (144 ports) -> no price and support
information yet

c) Flextronics 10U 144 Port Modular -> very good price but little
support => risky option?

I am in a mess. What is your opinion about this matter? Are you using
any of these products?

Regards
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
