Hi!

The diskless provisioning approach is definitely the way to go. We use a cluster toolkit called Jesswulf, which is available at:

<advertisement>

http://hpc.arc.georgetown.edu/mirror/jesswulf/

By default it runs on Red Hat/CentOS/Fedora systems, though it has been ported to Ubuntu and SuSE without too much trouble. Perseus/Warewulf also work well. We also teach cluster courses, which may be helpful:

http://training.arc.georgetown.edu/

</advertisement>

To answer some of your questions: I prefer the read-only NFSROOT approach with a small (less than 20 MB) ramdisk. We use this on all of our clusters (about 7 of them) and it works fine, even on heterogeneous systems. One cluster has a mix of P4 Xeons, dual-core Opterons, and quad-core Xeons all using the same NFSROOT, so you simply update one directory on the master node and *all* of the compute nodes have the new software. We love it! We simply compile the kernel or build the initrd with hardware support for all of the nodes. We often use different hardware for the master and compute nodes without issue. The only thing that we don't mix is 32-bit and 64-bit: we have a couple of 32-bit clusters and the rest are 64-bit.

The main issue that you need to deal with is having a fast enough storage system for parallel jobs that generate a lot of data. We use the local hard drives in the compute nodes for "scratch" space, plus some type of shared file system for results. On the small clusters we use NFS, but on the bigger clusters we use GlusterFS over InfiniBand, which has proven to be very nice. If you are running MPI jobs with lots of data, you might want to consider adding InfiniBand. Even the cheap ($125) InfiniBand cards give much better performance than standard gigabit Ethernet, and you can always run IP over IB for applications or services that need standard IP.
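
Just to illustrate the "local scratch first" pattern, here is a minimal Python sketch (the /scratch and /shared mount points and file names are made up for the example; your real jobs would write whatever data they actually produce):

    # Minimal sketch: write heavy intermediate output to the node's local
    # disk, then copy the finished file to the shared file system once at
    # the end of the job. Paths are hypothetical.
    import os
    import shutil

    local_dir = "/scratch/myjob"          # local disk on the compute node
    shared_dir = "/shared/results/myjob"  # NFS or GlusterFS mount

    os.makedirs(local_dir, exist_ok=True)
    os.makedirs(shared_dir, exist_ok=True)

    local_file = os.path.join(local_dir, "output.dat")
    with open(local_file, "w") as f:
        for step in range(1000):
            f.write(f"step {step} data\n")  # many small writes hit only local disk

    shutil.copy2(local_file, shared_dir)    # one bulk copy to shared storage

The point is that all of the small writes hit the local disk, and the shared file system only sees one big sequential copy at the end of the job.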

You mention that you don't think you will have much MPI traffic, but that you will be copying the results back to the master. That is where we see the highest load on our NFS file systems: when all of the compute nodes write at the same time, even on small clusters (fewer than 20 nodes). We've found that a clustered file system like GlusterFS gives much lower I/O wait when copying lots of files than NFS does. You might consider picking up some of the cheap IB cards ($125) and switches ($750 for 8 ports / $2400 for 24 ports) in order to do some relatively inexpensive testing. Here is one place where you can find them:

http://www.colfaxdirect.com/store/pc/viewCategories.asp?pageStyle=m&idCategory=6
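
Aside from the interconnect, one way to soften that burst of simultaneous NFS writes is to funnel the final results through a single MPI rank. Here is a minimal sketch in Python with mpi4py, purely to show the pattern (wavewatch3 is its own MPI code, so this is only an illustration; the output path is made up):

    # Minimal sketch: gather per-node results to rank 0 so only one node
    # writes to the shared directory, instead of every node writing at once.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    # Each rank does its independent processing and keeps its piece in
    # memory (or on local scratch, as in the earlier sketch).
    local_result = f"partial result from rank {rank}\n"

    # Collect all pieces on rank 0 over the interconnect (GigE or IB).
    all_results = comm.gather(local_result, root=0)

    if rank == 0:
        # Only rank 0 touches the shared file system.
        with open("/shared/results/run001.txt", "w") as out:  # hypothetical path
            out.writelines(all_results)

Whether that is worth the trouble depends on how big the per-node results are; for very large files, staggering the copies from local scratch can work just as well.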

I'd be happy to talk to you. My phone number is below and you have my e-mail.

Jess

--
Jess Cannata
Advanced Research Computing &
High Performance Computing Training
Georgetown University
202-687-3661



P.R. wrote:
Hi,
I'm new to the list and also to cluster technology in general.
I'm planning on building a small 20+ node cluster, and I have some basic
questions.
We're planning on running 5-6 motherboards with quad-core AMD 3.0 GHz
Phenoms and 4 GB of RAM per node.
Off the bat, does this sound like a reasonable setup?

My first question is about node file and operating systems:
I'd like to go with a diskless setup, preferably using an NFS root for each
node.
However, based on some of the testing I've done, running the nodes off of the
NFS share(s) has turned out to be rather slow and quirky.
Our master node will be running on completely different hardware than the
slaves, so I *believe* that will make it more complicated and tedious to set
up and update the NFS roots for all of the nodes (since it's not simply a
matter of 'cloning' the master's setup and config). Is there any truth to
this, or am I way off?

Can anyone provide any general advice or feedback on how best to set up a
diskless node?


The alternative I was considering was using (4GB?) USB flash drives to hold a
full-blown, local OS install on each node...
Q: Does anyone have experience running a node off of a USB flash drive?
If so, what are some of the pros/cons/issues associated with this type of
setup?


My next questions are about network setup.
Each motherboard has an integrated gigabit NIC.

Q: Should we be running two gigabit NICs per motherboard instead of one?
Is there a rule of thumb for sizing the network requirements
(e.g., 'one NIC per 1-2 processor cores')?


Also, we were planning on plugging EVERYTHING into one big (unmanaged)
gigabit switch.
However, I read somewhere on the net that another cluster was physically
separating NFS and MPI traffic onto two separate gigabit switches.
Any thoughts as to whether we should implement two switches, or should we be
OK with only one switch?


Notes:
The application we'll be running is NOAA's wavewatch3, in case anyone has
any experience with it.
It will generate a fair amount of NFS traffic (each node must read a common
set of data at periodic intervals), and I *believe* that the MPI traffic is
not extremely heavy or constant (i.e., nodes do large amounts of independent
processing before sending results back to the master).


I'd appreciate any help or feedback anyone would be willing and able to offer...

Thanks,
P.Romero

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


