Glen,

I have had great success with the *right* 10GbE NIC and NFS. The important things to consider are:

- How much bandwidth will your backend storage provide? With 2 x 4Gb FC, I'm guessing the best case is about 600MB/s, but likely less.
- What access patterns do the "typical apps" have?
  - All nodes read from a single file (no problem for NFS, and fscache may help even more).
  - All nodes write to a single file (NFS may need some help, or may be too slow when tuned for this).
  - All nodes read and write to separate files (NFS is fine if the files aren't too big for the OS to cache reasonably).
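
As a rough back-of-envelope check on the backend question, here is a minimal sketch. It assumes the nominal figure of about 400 MB/s per direction for a 4Gb FC link and the raw 10GbE line rate, and it reads the best-case guess above as 600 MB/s. The function names are made up for illustration, and the 32 nodes come from the original question.

```python
# Nominal rates in MB/s per link, per direction (assumptions, not measurements).
FC_4G_MB_S = 400            # 4Gb Fibre Channel, nominal payload rate
TEN_GBE_MB_S = 10_000 / 8   # 10GbE raw line rate, before protocol overhead


def backend_ceiling(links, per_link_mb_s):
    """Theoretical aggregate throughput of several identical links."""
    return links * per_link_mb_s


def per_node_share(total_mb_s, nodes):
    """Even split of the available throughput when every node does I/O at once."""
    return total_mb_s / nodes


if __name__ == "__main__":
    fc_peak = backend_ceiling(2, FC_4G_MB_S)
    guess = 600  # the best-case guess from the text, read as MB/s
    bottleneck = min(guess, TEN_GBE_MB_S)
    print(f"2 x 4Gb FC nominal peak: {fc_peak} MB/s")
    print(f"guess as a fraction of nominal: {guess / fc_peak:.2f}")
    print(f"one 10GbE link, raw: {TEN_GBE_MB_S} MB/s")
    print(f"bottleneck for a single server: {bottleneck} MB/s")
    print(f"share per node with 32 busy nodes: {per_node_share(bottleneck, 32):.2f} MB/s")
```

The point of the arithmetic is only that a single 10GbE link is not the limit here; the storage behind it is, and the per-node share shrinks quickly when all nodes are active.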

The number of I/O servers really is a function of how much disk throughput you have on the backend, on the frontend, and through the kernel/filesystem goo. In my experience a 10GbE NIC from Myricom can easily sustain 500-700MB/s if the storage behind it can and the access patterns aren't evil. Other NICs from large and small vendors can fall apart at 3-4Gb/s, so be careful and test the network first before assuming your FS is the troublemaker. There are cheap switches with 2 or 4 10GbE CX4 connectors that make this much simpler and safer, with or without the parallel FS options.
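
To make "test the network first" concrete, here is a minimal sketch of a raw TCP throughput check that keeps the filesystem out of the picture. A dedicated tool such as iperf is the usual choice; this is only an illustration. The function names and the 1 MiB chunk size are arbitrary, and the demo runs over loopback, so on a real cluster the sink would run on one node and the sender on another.

```python
import socket
import threading
import time

CHUNK = 1 << 20  # 1 MiB per send; an arbitrary choice


def sink_server(listener, result):
    """Accept one connection and count the bytes received until EOF."""
    conn, _ = listener.accept()
    received = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            received += len(data)
    result["bytes"] = received


def measure_throughput(host, port, total_bytes):
    """Send total_bytes to host:port and return MB/s as seen by the sender."""
    payload = b"\0" * CHUNK
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as sock:
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            sock.sendall(payload[:n])
            sent += n
    elapsed = time.perf_counter() - start
    return sent / elapsed / 1e6


def loopback_demo(total_bytes=64 * CHUNK):
    """Run sink and sender in one process over 127.0.0.1."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    result = {}
    worker = threading.Thread(target=sink_server, args=(listener, result))
    worker.start()
    rate = measure_throughput("127.0.0.1", port, total_bytes)
    worker.join()
    listener.close()
    return rate, result["bytes"]


if __name__ == "__main__":
    rate, received = loopback_demo()
    print(f"sent and received {received} bytes at about {rate:.0f} MB/s")
```

The rate is timed on the sending side only, so short transfers are flattered by socket buffering; use a transfer large enough to run for several seconds before trusting the number.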

Depending on how big/small and how "scratch" the need is... a big tmpfs/ramdisk can be fun :)
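
For comparing scratch candidates (an NFS mount, a local disk, a tmpfs), a minimal sequential write/read timing sketch might look like the following. The function name and default sizes are made up for illustration, and any directory you pass in, such as a tmpfs mount point, is your own choice.

```python
import os
import tempfile
import time


def time_sequential_io(directory, size_mb=64, block_kb=1024):
    """Write then read one temporary file in `directory`.

    Returns (write MB/s, read MB/s, bytes read back).
    """
    block = os.urandom(block_kb * 1024)
    n_blocks = (size_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # push the data to the backing store
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        total = 0
        with open(path, "rb") as f:
            while True:
                chunk = f.read(len(block))
                if not chunk:
                    break
                total += len(chunk)
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)
    return total / 1e6 / write_s, total / 1e6 / read_s, total


if __name__ == "__main__":
    target = tempfile.gettempdir()  # swap in the mount point you want to test
    w, r, n = time_sequential_io(target)
    print(f"{target}: wrote {n} bytes at {w:.0f} MB/s, read back at {r:.0f} MB/s")
```

The read-back will usually be served from the page cache because the file was just written on the same machine. That is the "not too big for the OS to cache" case; to see the storage itself, use a file larger than RAM or read it from a different node.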

Good luck!
Greg



On Sep 25, 2008, at 9:01 AM, [EMAIL PROTECTED] wrote:

Date: Thu, 25 Sep 2008 09:40:54 -0400
From: Glen Beane <[EMAIL PROTECTED]>
Subject: [Beowulf] scratch File system for small cluster
To: "beowulf@beowulf.org" <beowulf@beowulf.org>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset="iso-8859-1"

I am considering adding a small parallel file system (~5-10TB) to my small cluster (~32 2x dual core Opteron nodes) that is used mostly by a handful of regular users. Currently the only storage accessible to all nodes is home directory space, which is provided by the Lab's IT department (this is a SAN volume connected to the head node by 2x FC links, and NFS exported to the compute nodes). I don't have to "worry" about the IT provided SAN space - they back it up, provide redundant hardware, etc. The parallel file system would be scratch space (and not backed up by IT). We have a mix of home grown apps doing a pretty wide range of things (some do a lot of I/O, others don't), and things like BLAST and BLAT.

Can anyone out there provide recommendations for a good solution for fast
scratch space for a cluster of this size?

Right now I was thinking about PVFS2. How many I/O servers should I have,
and how many cores and RAM per I/O server?
Are there other recommendations for fast scratch space? (It doesn't have to
be a parallel file system; something with less hardware would be nice.)

--
Glen L. Beane
Software Engineer
The Jackson Laboratory
http://www.jax.org

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
