On Wed, 2 Sep 2009 at 10:29pm, Rahul Nabar wrote:

> That brings me to another important question. Any hints on speccing
> the head node? Especially the kind of storage I put in on the head
> node. I need around 1 terabyte of storage. In the past I've used
> RAID5+SAS in the server, mostly for running jobs that access their I/O
> via files stored centrally.
>
> For muscle I was thinking of a Nehalem E5520 with 16 GB RAM. Should I
> boost the RAM up? Or any other comments? It is tricky to spec the
> central node.
>
> Or is it more advisable to go for a storage box external to the server
> for NFS storage and then figure out a fast way of connecting it to the
> server? Fibre Channel, perhaps?

Speccing storage for a 300-node cluster is a non-trivial task and is heavily dependent on your expected access patterns. Unless you anticipate vanishingly little concurrent access, you'll be very hard-pressed to service a cluster that large with a basic Linux NFS server. About a year ago I had ~300 nodes pointed at a NetApp FAS3020 with 84 spindles of 10K RPM FC-AL disks. A single user could *easily* flatten the NetApp (read: 100% CPU and multi-second/minute latencies for everybody else) without even using the whole cluster.
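
For a sense of scale, here's a quick back-of-envelope in Python. Every number in it is an assumption I've picked purely for illustration (not a measurement of your workload or mine), but it shows how fast the arithmetic gets ugly:

# Rough back-of-envelope: aggregate I/O demand from the cluster vs. what a
# single basic Linux NFS server can realistically deliver.
# All numbers below are illustrative assumptions -- plug in your own.

nodes = 300                 # compute nodes in the cluster
mb_per_sec_per_node = 10    # assume each active job streams ~10 MB/s to central storage
active_fraction = 0.25      # assume only a quarter of the nodes hit storage at once

demand = nodes * mb_per_sec_per_node * active_fraction   # MB/s the cluster asks for

# Assume a GigE-attached NFS head node delivers on the order of 100 MB/s of
# useful throughput under mixed load (optimistic for small or random I/O).
nfs_capacity = 100.0

print("Demand: %.0f MB/s, capacity: ~%.0f MB/s, oversubscription: %.1fx"
      % (demand, nfs_capacity, demand / nfs_capacity))

And that's only streaming bandwidth; metadata-heavy or small-random I/O will fall over well before you hit those numbers.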

Whatever you end up with for storage, you'll need to be vigilant regarding user education. Jobs should store as much in-process data as they can on the nodes (assuming you're not running diskless nodes) and large jobs should stagger their access to the central storage as best they can.
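
As a concrete illustration of that stage-in/stagger/stage-out pattern, here's a minimal job-wrapper sketch in Python. The paths, the solver name, and the two-minute stagger window are all assumptions on my part -- adapt them to your scheduler and filesystem layout:

#!/usr/bin/env python3
"""Sketch of a job wrapper that keeps in-process I/O on node-local disk.
Paths, the solver command, and the stagger window are illustrative only."""
import os
import random
import shutil
import subprocess
import time

NFS_INPUT = "/home/user/project/input.dat"    # assumed central (NFS) input file
NFS_RESULTS = "/home/user/project/results"    # assumed central results directory
LOCAL_SCRATCH = "/scratch"                     # assumed node-local scratch disk

def main():
    # Stagger the stage-in so hundreds of jobs don't hit NFS at the same instant.
    time.sleep(random.uniform(0, 120))

    workdir = os.path.join(LOCAL_SCRATCH, "job-%d" % os.getpid())
    os.makedirs(workdir, exist_ok=True)

    # Stage in once, then do all intermediate I/O against local disk.
    local_input = os.path.join(workdir, "input.dat")
    shutil.copy(NFS_INPUT, local_input)

    # Run the actual computation against local files ("my_solver" is a placeholder).
    local_output = os.path.join(workdir, "out.dat")
    subprocess.run(["my_solver", local_input, "-o", local_output], check=True)

    # Stagger the stage-out as well, then copy only the final results back.
    time.sleep(random.uniform(0, 120))
    shutil.copy(local_output, NFS_RESULTS)
    shutil.rmtree(workdir)

if __name__ == "__main__":
    main()

Even something this crude, wrapped around users' jobs by default, takes a lot of pressure off the central storage.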

--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf