On Wed, 2 Sep 2009 at 10:29pm, Rahul Nabar wrote
That brings me to another important question: any hints on speccing
the head node? Especially the kind of storage to put in it. I need
around 1 terabyte of storage. In the past I've used RAID5 + SAS in the
server, mostly for running jobs that do their I/O via files stored
centrally.
For muscle I was thinking of a Nehalem-based Xeon E5520 with 16 GB of
RAM. Should I boost the RAM? Any other comments? It is tricky to spec
the central node.
Or is it more advisable to go for a storage box external to the server
for the NFS stores, and then figure out a fast way of connecting it to
the server? Fiber, perhaps?
Speccing storage for a 300 node cluster is a non-trivial task and is
heavily dependent on your expected access patterns. Unless you anticipate
vanishingly little concurrent access, you'll be very hard pressed to
service a cluster that large with a basic Linux NFS server. About a year
ago I had ~300 nodes pointed at a NetApp FAS3020 with 84 spindles of 10K
RPM FC-AL disks. A single user could *easily* flatten the NetApp (read:
100% CPU and multi-second-to-minute latencies for everybody else)
without even using the whole cluster.
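
For what it's worth, a crude way to find out where a candidate server
falls over is to launch a small write probe on 1, 10, 50, ... nodes at
once and watch the per-pass times climb. Something along these lines
(just a sketch -- the /shared/scratch mount point and the block/pass
sizes are placeholders, and you'd kick it off with pdsh or your
scheduler):

    #!/usr/bin/env python
    # Rough concurrency probe for a shared filesystem.  Run it on a
    # growing number of nodes at once and watch how write latency
    # degrades as clients are added.  Mount point and sizes are
    # made-up defaults -- adjust for your environment.
    import os
    import sys
    import time
    import socket

    MOUNT = sys.argv[1] if len(sys.argv) > 1 else "/shared/scratch"
    BLOCK = 1024 * 1024   # 1 MiB per write
    COUNT = 256           # 256 MiB per pass
    PASSES = 5

    def one_pass(path):
        """Write COUNT blocks, fsync, and return elapsed seconds."""
        buf = os.urandom(BLOCK)
        start = time.time()
        with open(path, "wb") as f:
            for _ in range(COUNT):
                f.write(buf)
            f.flush()
            os.fsync(f.fileno())
        return time.time() - start

    def main():
        host = socket.gethostname()
        path = os.path.join(MOUNT, "probe-%s-%d.dat" % (host, os.getpid()))
        for i in range(PASSES):
            elapsed = one_pass(path)
            mb_s = (BLOCK * COUNT) / (1024.0 * 1024.0) / elapsed
            print("%s pass %d: %.1f s, %.1f MB/s" % (host, i + 1, elapsed, mb_s))
        os.unlink(path)

    if __name__ == "__main__":
        main()

If the per-pass times balloon long before you reach realistic client
counts, a single plain NFS box isn't going to cut it for that workload.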
Whatever you end up with for storage, you'll need to be vigilant regarding
user education. Jobs should store as much in-process data as they can on
the nodes (assuming you're not running diskless nodes) and large jobs
should stagger their access to the central storage as best they can.
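
To make that last point concrete, the pattern we try to get users to
adopt looks roughly like the wrapper below (a sketch only -- the paths
and the "solver" binary are made up): stage input once, do all
intermediate I/O in node-local scratch, and add a random back-off
before copying results to the central store.

    #!/usr/bin/env python
    # Illustrative job wrapper: stage data to node-local scratch,
    # compute there, and add a random delay before copying results
    # back so a large job array doesn't hit the central NFS store
    # all at once.  Paths and the compute step are placeholders.
    import os
    import random
    import shutil
    import subprocess
    import time

    CENTRAL = "/shared/project/run42"           # NFS-mounted project dir
    SCRATCH = os.environ.get("TMPDIR", "/tmp")  # node-local scratch

    def main():
        workdir = os.path.join(SCRATCH, "job-%d" % os.getpid())
        os.makedirs(workdir)

        # Stage input from central storage once, at the start.
        shutil.copy(os.path.join(CENTRAL, "input.dat"), workdir)

        # Do all intermediate I/O against local disk, not the NFS mount.
        subprocess.check_call(["./solver", "input.dat"], cwd=workdir)

        # Stagger the copy-back: sleep 0-300 s so hundreds of tasks
        # finishing together don't flatten the file server.
        time.sleep(random.uniform(0, 300))
        shutil.copy(os.path.join(workdir, "output.dat"), CENTRAL)
        shutil.rmtree(workdir)

    if __name__ == "__main__":
        main()

The random sleep is the cheap trick: when a few hundred tasks finish
within the same minute, spreading the copy-back over several minutes
keeps the file server responsive for everyone else.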
--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF