On Fri, 28 Apr 2006, David Kewley wrote:
> By the way, the idea of rolling-your-own hardware on a large cluster, and planning on having a small technical team, makes me shiver in horror. If you go that route, you better have *lots* of experience in clusters, and make very good decisions about cluster components and management methods. If you don't, your users will suffer mightily, which means you will suffer mightily too.
I >>have<< lots of experience in clusters and have tried rolling my own nodes for a variety of small and medium-sized clusters. Let me clarify.

For clusters with more than perhaps 16 nodes, or EVEN 32 if you're feeling masochistic and inclined to heartache: Don't. Or you will have a really high probability of being very, very sorry.

16-node clusters I've done "ok" with, in the sense that the problems were manageable. >32-node clusters, especially if you encounter ANY ex post facto problems with the hardware configuration -- including ones that passed through your original prototyping runs (and yeah, they exist) -- rapidly descend into circle-of-hell type experiences. Expensive ones. Much more expensive in real money, let alone time, than just buying nodes from a quality vendor with a 3-4 year onsite service contract, so that if the nodes break the vendor will come fix them (but they don't break -- see the word "quality" above :-).

Other than thinking that "shiver in horror" is somehow inadequate to describe the potential for misery, I endorse pretty much everything else David (and Mark) said -- both these guys know whereof they speak.

rgb

--
Robert G. Brown  http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525