Re: [Beowulf] Compute Node OS on Local Disk vs. Ram Disk

Bogdan Costescu Tue, 30 Sep 2008 02:27:31 -0700

On Sun, 28 Sep 2008, Jon Forrest wrote:

There are two philosophies on where a compute node's OS and basic utilities should be located:

You forgot the NFS-root setup, which doesn't require memory for a RAM disk on which you later mount NFS dirs.

In both cases it's important to remember to make any changes to this distribution rather than just using "pdsh" or "tentakel" to dynamically modify a compute node. This is so that the next time the compute node boots, it gets the up-to-date distribution.

I prefer to look at the nodes as disposable, instead of "let's keep the node up as long as possible", so I usually don't modify a running system. Instead I modify the node "image" and reboot the nodes after the current jobs finish - this is easy to do when using a queueing system and is easy to hide from users when the typical jobs are longer than the reboot time.
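
As a rough illustration of the "modify the image, then reboot once the jobs finish" policy, the selection step might look like the sketch below. The node records, field names and version strings are all hypothetical; a real setup would get this information from the queueing system rather than from a hand-built list.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str            # hypothetical node name
    image_version: str   # version of the image the node booted with
    running_jobs: int    # jobs currently running on the node


def nodes_to_reboot(nodes, current_version):
    """Return names of nodes that run an outdated image and are idle."""
    return [
        n.name
        for n in nodes
        if n.image_version != current_version and n.running_jobs == 0
    ]


if __name__ == "__main__":
    # Made-up cluster state, for illustration only.
    cluster = [
        Node("node01", "2008-09-01", 0),
        Node("node02", "2008-09-01", 2),
        Node("node03", "2008-09-29", 0),
    ]
    print(nodes_to_reboot(cluster, "2008-09-29"))
```

Busy nodes are simply skipped and picked up on a later pass, which is what keeps the reboot hidden from users.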

However, on a modern multicore compute node this might just be a few percent of the total RAM on the node.

This also depends on how much of the distribution you keep as part of the node "image" and how you place the application software. It's often the case that the application software is distributed to the nodes from a cluster-wide FS, either from a directory holding software only or from the user's home dir; extending this to also include most of the libraries needed by the application software (e.g. fftw) means that the node "image" can be made very small without putting any part of the distribution on NFS (I know that some people totally dislike system utilities coming from an NFS-mounted directory or depending on libs coming from one).
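
To put rough numbers on the "few percent of the total RAM" point, here is a small back-of-the-envelope calculation. The image sizes and the 16 GiB of node RAM are purely illustrative assumptions, not figures from this thread.

```python
def ramdisk_fraction(image_mib, ram_gib):
    """Fraction of node RAM taken by a RAM-disk image of the given size."""
    return image_mib / (ram_gib * 1024)


if __name__ == "__main__":
    # Hypothetical image sizes on a hypothetical 16 GiB node.
    for image_mib in (100, 300, 1000):
        share = ramdisk_fraction(image_mib, 16)
        print(f"{image_mib} MiB image on 16 GiB RAM: {share:.1%}")
```

The smaller the node "image" is made, the less this fraction matters.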

Approach #2 requires much less time when a node is installed,
and a little less time when a node is booted.

I don't agree with you here, as you probably have in mind a kickstart-based install for approach #1 running upon each node boot. I have long used a different approach: the node "image" is copied via rsync at boot time. The long wait for installing the RPMs and running whatever configuration scripts are needed happens only once, when the node "image" is prepared; the nodes only copy it. And it's a full copy only the first time they boot with a new disk; afterwards it's the rsync magic which makes it finish within seconds while making it look like a new disk :-)
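
A minimal sketch of what such a boot-time sync could look like, written as a helper that only builds the rsync command line. The server name, module name, target mount point and exclude pattern are hypothetical, and the option set is an assumption rather than the exact invocation used here: -a preserves permissions and ownership, -H keeps hard links, -x stays on one file system, and --delete removes anything on the local disk that is not in the image, which is what makes the disk "look like new".

```python
import shlex


def rsync_image_command(source, target, excludes=()):
    """Build an rsync command that mirrors a node image onto a local disk."""
    cmd = ["rsync", "-aHx", "--numeric-ids", "--delete"]
    for pattern in excludes:
        # Keep per-node files (e.g. host keys) out of the sync.
        cmd.append(f"--exclude={pattern}")
    # Trailing slash on the source: copy its contents, not the directory itself.
    cmd += [source.rstrip("/") + "/", target]
    return cmd


if __name__ == "__main__":
    # Hypothetical rsync daemon module and mount point.
    cmd = rsync_image_command(
        "imageserver::node-image",
        "/mnt/root",
        excludes=["/etc/ssh/ssh_host_*"],
    )
    print(shlex.join(cmd))
```

On a disk that already holds the image, rsync transfers only the files that changed, so the second and later boots finish quickly.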


--
Bogdan Costescu

IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850
E-mail: [EMAIL PROTECTED]
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
