Re: [Beowulf] commercial clusters

Buccaneer for Hire. Sat, 30 Sep 2006 16:04:32 -0700

Our traditional software has a methodology for striping across a number of 
nodes, and we have done that for years.  The new software is different: it will 
build a 10-12GB sparse file, for instance, and as each of these nodes finishes 
it will update the information in its portion of the file.  The problem is 
that the head node is bragging about doing close to 200MB/sec over NFS while 
the EMC is telling us it's pushing 25MB/sec.
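
For anyone unfamiliar with the access pattern, here is a minimal Python sketch 
of it.  It is not the actual application code: update_region, node_index and 
region_size are hypothetical names, and it assumes every node owns one 
fixed-size region of the shared file.  It only illustrates the pattern and says 
nothing about how it performs over NFS.

```python
import os

def update_region(path, node_index, region_size, payload):
    """Write one node's payload at the start of its own region of a shared file."""
    if len(payload) > region_size:
        raise ValueError("payload does not fit in the node's region")
    # O_CREAT without O_TRUNC, so regions written by other nodes are left alone.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        offset = node_index * region_size  # assumed layout: region i starts at i * region_size
        view = memoryview(payload)
        while view:
            # pwrite takes an explicit offset; ranges nobody writes stay as holes.
            written = os.pwrite(fd, view, offset)
            view = view[written:]
            offset += written
    finally:
        os.close(fd)
```

Calling update_region("result.dat", 2, 1024, b"abc") on a fresh file leaves the 
first 2048 bytes unwritten, which on file systems that support sparse files 
take up no disk blocks.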

So when I choose which GFS to test, to my way of thinking it will need the 
ability to write to the same file across multiple head nodes.

----- Original Message ----
From: Stu Midgley <[EMAIL PROTECTED]>
To: Buccaneer for Hire. <[EMAIL PROTECTED]>
Cc: Beowulf List <[email protected]>
Sent: Friday, September 29, 2006 10:10:29 PM
Subject: Re: [Beowulf] commercial clusters

hmmm...  200 nodes writing to the same file.  That is a hard problem.
In all my testing of global FS's I haven't found one that is capable
of doing this while delivering good performance.  One might think that
MPI-IO would deliver performance while writing to the same file
(on something like Lustre), but in my experience MPI-IO is more about
functionality than performance.

In any code that I write that needs a lot of bandwidth, I always write
an n-m IO routine.  That is, your n processor task can read the
previous m checkpoint-chunks (produced from an earlier m processor
job).  Then, when writing out the checkpoint or output file, you get
each process to open its own individual file and dump its data to it.
This gives you maximum bandwidth and stops meta-data thrashing on your
cluster FS.  It is also quite easy to write single-cpu tools which
concatenate the files together...
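
A minimal Python sketch of the write side and the concatenation tool is below.
The function names and the <base>.<rank> file naming are hypothetical, chosen
only for illustration, and the n-m read side is left out because it depends on
how the data is decomposed across processes.

```python
import os
import shutil

def chunk_path(directory, base, rank):
    # Hypothetical naming scheme: <base>.<rank>, zero-padded so names sort by rank.
    return os.path.join(directory, f"{base}.{rank:05d}")

def write_chunk(directory, base, rank, data):
    """Each process dumps its own data to its own individual file."""
    with open(chunk_path(directory, base, rank), "wb") as f:
        f.write(data)

def concatenate(directory, base, nchunks, out_path):
    """Single-CPU tool: join chunk files 0..nchunks-1 in rank order."""
    with open(out_path, "wb") as out:
        for rank in range(nchunks):
            with open(chunk_path(directory, base, rank), "rb") as f:
                shutil.copyfileobj(f, out)
```

In a real job each rank would call write_chunk once per checkpoint, and
concatenate would be run afterwards on a single node.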

Alternatively, you can write a simple client-side FUSE file system
which sort of joins multiple NFS mounts together into a single FS.  In
this way, you can stripe your IO over multiple NFS mounts...  very
similar to the cluster file system that was present in the
Digital/Compaq SC machines.  In this fashion, your file in the FUSE FS
looks consistent and coherent while in the underlying NFS directories
you see your file split up into bits (file.1 file.2 file.3 file.4 etc
for a 4 NFS mount system).  It's a simple way to get your bandwidth up
(especially if your NFS mounts are coming in over different gig-e
NICs), but it still gives REALLY crap bandwidth when multiple threads
try to write to the same file...
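
The offset arithmetic behind such a striped layout can be sketched in Python
as below.  This is not a FUSE file system, only the mapping it would need, and
it assumes round-robin striping with a fixed stripe_size; locate and
striped_write are hypothetical names, and plain directories stand in for the
NFS mounts.

```python
import os

def locate(offset, stripe_size, n_mounts):
    """Map a logical byte offset to (1-based piece number, offset inside that piece)."""
    stripe = offset // stripe_size              # which stripe the byte falls in
    piece = stripe % n_mounts                   # stripes go round-robin over the mounts
    local = (stripe // n_mounts) * stripe_size + offset % stripe_size
    return piece + 1, local

def striped_write(mounts, name, offset, data, stripe_size):
    """Write data at a logical offset, splitting it across name.1 .. name.N."""
    pos = 0
    while pos < len(data):
        piece, local = locate(offset + pos, stripe_size, len(mounts))
        # Never cross a stripe boundary in a single write.
        room = stripe_size - (offset + pos) % stripe_size
        part = data[pos:pos + room]
        path = os.path.join(mounts[piece - 1], f"{name}.{piece}")
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
        try:
            pos += os.pwrite(fd, part, local)   # advance by what was actually written
        finally:
            os.close(fd)
```

With four mounts and a 4-byte stripe, writing 20 bytes at offset 0 puts stripes
0 and 4 in file.1 and one stripe each in file.2, file.3 and file.4.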

Try Lustre :)

_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
