Glen,
I have had great success with the *right* 10GbE nic and NFS. The
important things to consider are:
How much bandwidth will your backend storage provide? With 2 x 4Gb FC
I'm guessing best case is 600MB/s, but likely less.
What access patterns do the "typical apps" have?
  - All nodes read from a single file (no prob for NFS, and fscache may
    help even more)
  - All nodes write to a single file (NFS may need some help or may be
    too slow when tuned for this)
  - All nodes read and write to separate files (NFS is fine if the files
    aren't too big for the OS to cache reasonably).
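The "separate files" pattern is the easiest one to try out before buying anything. Below is a minimal sketch, not a real benchmark: threads on one machine stand in for compute nodes, the function names and file names are made up for illustration, and the read-back will mostly come from the client page cache (which is the point about files the OS can cache). For real numbers, run something like it from several nodes against the actual NFS mount.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def write_then_read(path, size_mb, block=1 << 20):
    """Write size_mb MiB to path, fsync, read it back; return bytes moved."""
    buf = b"\0" * block
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to storage
    moved = size_mb * block
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block)
            if not chunk:
                break
            moved += len(chunk)
    return moved


def separate_files_test(directory, workers=4, size_mb=8):
    """Each worker uses its own file; return aggregate MB/s (decimal MB)."""
    paths = [os.path.join(directory, f"scratch_test_{i}.dat")
             for i in range(workers)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(lambda p: write_then_read(p, size_mb), paths))
    elapsed = time.perf_counter() - start
    for p in paths:
        os.remove(p)  # clean up the test files
    return total / elapsed / 1e6


if __name__ == "__main__":
    # Point this at the mount under test; a temp dir is used here only
    # so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as d:
        print(f"{separate_files_test(d):.0f} MB/s aggregate")
```

Raising size_mb well past the client's RAM is what shows how the pattern behaves once caching stops helping.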
The number of IO servers really is a function of how much disk
throughput you have on the backend, on the frontend, and through the
kernel/filesystem goo. My experience is that a 10GbE nic from Myricom
can easily sustain 500-700MB/s if the storage behind it can and the
access patterns aren't evil. Other nics from large and small vendors
can fall apart at 3-4Gb/s, so be careful and test the network before
assuming your FS is the troublemaker. There are cheap switches with 2
or 4 10GbE CX4 connectors that make this much simpler and safer, with
or without the parallel FS options.
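A rough way to compare these numbers is to put everything in MB/s and take the minimum across the path. The sketch below is only back-of-envelope arithmetic: it assumes the 600 figure for the FC links means MB/s, uses decimal units (1 Gb/s = 125 MB/s), ignores protocol overhead, and the function names and stage labels are made up for illustration.

```python
def gbit_to_mbyte(gbit_per_s):
    """Convert Gb/s to MB/s in decimal units (1 Gb/s = 125 MB/s)."""
    return gbit_per_s * 1000 / 8


def bottleneck(stages):
    """Return (name, MB/s) of the slowest stage in the I/O path."""
    name = min(stages, key=stages.get)
    return name, stages[name]


# Figures taken from the message; the labels are illustrative only.
stages = {
    "backend (2 x FC, best-case guess)": 600.0,
    "good 10GbE nic (low end of 500-700MB/s)": 500.0,
}

# Same path, but with a nic that falls apart at 3 Gb/s.
weak_nic = dict(stages)
weak_nic["nic that falls apart at 3 Gb/s"] = gbit_to_mbyte(3)

if __name__ == "__main__":
    print(bottleneck(stages))
    print(bottleneck(weak_nic))
```

With a nic that tops out at 3 Gb/s the network, not the storage or the file system, becomes the slowest stage, which is why it is worth measuring first.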
Depending on how big/small and how "scratch" the need is... a big
tmpfs/ramdisk can be fun :)
Good luck!
Greg
On Sep 25, 2008, at 9:01 AM, [EMAIL PROTECTED] wrote:
Date: Thu, 25 Sep 2008 09:40:54 -0400
From: Glen Beane <[EMAIL PROTECTED]>
Subject: [Beowulf] scratch File system for small cluster
To: "beowulf@beowulf.org" <beowulf@beowulf.org>
I am considering adding a small parallel file system (~5-10TB) to my
small cluster (~32 2x dual core Opteron nodes) that is used mostly by a
handful of regular users. Currently the only storage accessible to all
nodes is home directory space which is provided by the Lab's IT
department (this is a SAN volume connected to the head node by 2x FC
links, and NFS exported to the compute nodes). I don't have to "worry"
about the IT provided SAN space - they back it up, provide redundant
hardware, etc. The parallel file system would be scratch space (and not
backed up by IT). We have a mix of home grown apps doing a pretty wide
range of things (some do a lot of I/O, others don't), and things like
BLAST and BLAT.
Can anyone out there provide recommendations for a good solution for
fast scratch space for a cluster of this size?
Right now I was thinking about PVFS2. How many I/O servers should I
have, and how many cores and RAM per I/O server?
Are there other recommendations for fast scratch space (it doesn't have
to be a parallel file system, something with less hardware would be
nice)?
--
Glen L. Beane
Software Engineer
The Jackson Laboratory
http://www.jax.org