Hi,

Andrew Robbie (GMail) wrote:
> I am building a small (~16) node cluster with an IB interconnect. I need to
> decide whether I will buy a cheaper, dumb switch and run OpenSM, or get a
> more expensive switch with a built in subnet manager. The largest this
> system would ever grow is 32 nodes (two 24 port switches).
>
> Various vendors (integrators, not switch OEMs) have stated to me that
> managed switches are the go, and that OpenSM is (a) buggy, and (b) very
> time consuming to set up.

It's not _that_ buggy, and setup is pretty straightforward. But it lacks several features you'd really like in big systems. For 24 nodes or fewer you can go with a simple switch and OpenSM.

For 32 nodes you can put 16 nodes on each switch and use 8 cables for the switch interconnect, so in theory you should get 1/2 bisection bandwidth. But OpenSM configures IB forwarding rather statically at startup, never adjusts it to the actual usage of the links, and handles "hotplug" changes in topology rather poorly. So it is possible that some links are overused while others are not. Nevertheless, you can still find 24 nodes in your 32-node cluster that communicate nonblocking (if the remaining 8 stay silent), but I don't know a simple way to get this information from OpenSM or the switch. You can write a simple MPI program to benchmark it.

In addition, the versions of OpenSM I know sometimes crash silently (which does not affect anything), so you should monitor it in some way (you can restart it whenever you want).

Finally, I have to admit that these are all real-life experiences without any deep insight into OpenSM or even InfiniBand.

So, in conclusion, I would suggest going with a simple 24-port switch and OpenSM for now. If you upgrade to more than 24 nodes, you should add a more advanced switch. In my experience you can easily mix Mellanox switches with those formerly known as TopSpin; I don't know about other vendors.

One more hint: reconsider whether you need that many nodes for a single job. If you limit a job to at most 24 nodes, you can easily go with two dumb 24-port switches for up to 48 nodes, and the nodes within each subcluster can communicate nonblocking. But of course this way no node of one subcluster can communicate with a node of the other, and you need a resource management system able to assign the nodes of a single subcluster to a job.

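The port arithmetic behind the 32-node layout above (16 nodes per switch, 8 inter-switch cables, 1/2 bisection bandwidth) can be checked with a few lines of Python. This is only a sketch: the function name `two_switch_layout` and its parameters are made up for illustration, and it assumes two identical switches, an even split of nodes, and all spare ports used as inter-switch links. It says nothing about how OpenSM actually routes traffic over those links.

```python
from fractions import Fraction


def two_switch_layout(total_nodes, ports_per_switch=24):
    """Hypothetical helper: split nodes evenly over two switches and
    use the remaining ports on each switch as inter-switch links."""
    if total_nodes % 2:
        raise ValueError("this sketch assumes an even split over two switches")
    nodes_per_switch = total_nodes // 2
    spare_ports = ports_per_switch - nodes_per_switch
    if spare_ports <= 0:
        raise ValueError("no ports left for inter-switch links")
    # More links than nodes per switch would not add bisection bandwidth.
    links = min(spare_ports, nodes_per_switch)
    # Ratio of inter-switch links to the links needed for full bisection.
    ratio = Fraction(links, nodes_per_switch)
    return nodes_per_switch, links, ratio


if __name__ == "__main__":
    nodes, links, ratio = two_switch_layout(32)
    print(f"{nodes} nodes/switch, {links} inter-switch links, "
          f"bisection ratio {ratio}")
```

For 32 nodes on two 24-port switches this gives 16 nodes per switch, 8 links and a ratio of 1/2, matching the numbers above. Whether the links are really used evenly still has to be measured, e.g. with the MPI benchmark mentioned earlier.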
Kind regards,
-- 
Mapsolute GmbH
Frank Gruellich
Map24 Systems and Networks
Duesseldorfer Strasse 40a
65760 Eschborn
Germany
Phone: +49 6196 77756-414
Fax: +49 6196 77756-100
http://www.mapsolute.com
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf