Joe, in relation to your Perl motto: I'm more than aware that there is always more than one way to skin a cat, and that great debate will surround the subject. Sometimes the exercise can be useful, if not bloody.
Unfortunately for me, I'm not currently in a decision-making position on any of this and am being "directed" to do certain things in line with a path that somebody already established, but that existed only in their mind and was never written down. The system's compute nodes were originally built to be "Stateful", and the current power player on my team wants it to remain that way. As things sit today, I'm looking at using AutoYaST and am also evaluating xCAT to perform the task. The biggest issue with AutoYaST is that, while it will assist me in getting the OS out to the nodes, it really doesn't provide any of the cluster management tools that I would like to get installed.

Now you may be asking yourself, as I did, why "Stateful" compute nodes. It appears to me at this time that, along with occasionally using these nodes as part of a cluster, they also use them as plain old servers/workstations, as I've found user accounts & home directories on some of the compute nodes.

As I said in my first post, I'm new to this position & organization and not quite sure exactly how, or for what, the system is even used. I was simply told to get'er up.

Steven A. Herborn
U.S. Naval Academy
Advanced Research Computing
410-293-6480 (Desk)
757-418-0505 (Cell)

-----Original Message-----
From: Joe Landman [mailto:[EMAIL PROTECTED]
Sent: Monday, December 08, 2008 1:35 PM
To: Steve Herborn
Cc: beowulf@beowulf.org
Subject: Re: [Beowulf] Personal Introduction & First Beowulf Cluster Question

Steve Herborn wrote:
> Good day to the group. I would like to make a brief introduction to
> myself and raise my first question to the forum.
>
> My name is Steve Herborn and I am a new employee at the United States
> Naval Academy in the Advanced Research Computing group which supports

Greetings Steve

> the IT systems used for faculty research. Part of my responsibilities
> will be the care & feeding of our Beowulf Cluster which is a
> commercially procured Cluster from Aspen Systems.
> It was purchased & installed about four or five years ago. As delivered
> the system was originally configured with two Head nodes each with 32
> compute nodes. One head node was running SUSE 9.x and the other Head Node
> was running Scyld (version unknown) also with 32 compute nodes. While I
> don't know all of the history, apparently this system was not very actively
> maintained and had numerous hardware & software issues, to include losing
> the array on which Scyld was installed. Prior to my arrival a

Ouch ... if you call the good folks at Aspen, they could help with that (ping me if you need a contact)

> decision was made to reconfigure the system from having two different
> head nodes running two different OS Distributions to one Head Node
> controlling all 64 Compute Nodes. In addition SUSE Linux Enterprise
> Server (10SP2) (X86-64) was selected as the OS for all of the nodes.

Ok.

> Now on to my question which will more than likely be the first of many.
> In the collective group wisdom what would be the most efficient &

Danger Will Robinson ... for the N people who answer, you are likely to get N+2 answers, and N/2 arguments going ... not a bad thing, but to steal from the Perl motto "there is more than one way to do these things ..."

> effective way to "push" the SLES OS out to all of the compute nodes once
> it is fully installed & configured on the Head Node. In my research

First: Stateless (e.g. diskless) versus Stateful (e.g. local installation). Scyld is "stateless" though Don will likely correct me (as this is massively oversimplified). SuSE can be installed Stateless or Stateful. Its installation can be automated ... we have been doing this for years (one of the few vendors to have done this with SuSE). It can also be run diskless ... we have booted compute nodes with Infiniband to fully operational compute nodes visible in all aspects within the cluster in under 60 seconds. This is the case for 9.3, 10.x SuSE flavors.
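As a rough illustration of the automated Stateful route (not anyone's production setup), one common pattern is to have each node PXE-boot the SuSE installer and hand it an AutoYaST profile on the kernel command line. The sketch below only generates the per-node pxelinux config files. The host name `head`, the URLs, the node names, and the MAC addresses are all made up for the example, and the `linux` / `initrd` file names assume the installer kernel and initrd were copied from the install media into the TFTP root under those names.

```python
"""Sketch: build per-node pxelinux config files for an unattended install.

Everything site-specific here (host "head", URLs, node names, MACs) is a
hypothetical placeholder, not taken from any real cluster.
"""

HEX_DIGITS = set("0123456789abcdef")


def pxe_filename(mac: str) -> str:
    """Return the pxelinux.cfg file name for a MAC address.

    pxelinux looks for "01-" (Ethernet hardware type) followed by the MAC
    in lowercase hex, with dashes between the octets.
    """
    octets = mac.lower().replace("-", ":").split(":")
    if len(octets) != 6 or any(
        len(o) != 2 or not set(o) <= HEX_DIGITS for o in octets
    ):
        raise ValueError(f"not a MAC address: {mac!r}")
    return "01-" + "-".join(octets)


def pxe_config(install_url: str, profile_url: str) -> str:
    """Boot entry passing the install source and AutoYaST profile to the installer."""
    return (
        "default install\n"
        "label install\n"
        "  kernel linux\n"
        f"  append initrd=initrd install={install_url} autoyast={profile_url}\n"
    )


def render_all(nodes: dict, install_url: str, profile_base: str) -> dict:
    """Map each config file name to its contents, one file per compute node."""
    return {
        pxe_filename(mac): pxe_config(install_url, f"{profile_base}/{name}.xml")
        for name, mac in nodes.items()
    }


if __name__ == "__main__":
    # 64 hypothetical compute nodes with made-up MAC addresses.
    nodes = {f"node{i:02d}": f"00:16:3E:00:00:{i:02X}" for i in range(1, 65)}
    files = render_all(nodes, "http://head/sles10sp2", "http://head/autoyast")
    print(len(files), "config files generated")
```

Writing the returned strings into the TFTP server's `pxelinux.cfg/` directory is left out on purpose. This covers only getting the OS onto the nodes; it does nothing for cluster management.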
> I've read about various Cluster packages/distributions that have that
> capability built in, such as ROCKS & OSCAR which appear to have the
> innate capability to do this as well as some additional tools that would
> be very nice to use in managing the system. However, from my current
> research it appears that they do not support SLES 10sp2 for the AMD

Rocks only supports Red Hat and its rebuilds; I wouldn't recommend it for the task as you have indicated. Oscar might be able to handle this, though I haven't kept up on it, so I am not sure how active it is.

You want to look at xCAT v2 (open source), and Warewulf/Perceus (open source). Our package (Tiburon) is not ready to be released, and we will likely make it a meta package atop Perceus at some point soon, though it is used in production at several large commercial companies specifically for SuSE clusters.

> 64-bit Architecture (although since I am so new at this I could be
> wrong). Are there any other "free" (money is always an issue) products
> or methodologies I should be looking at to push the OS out & help me
> manage the system? It appears that a commercial product Moab Cluster

See above. If you want a prepackaged system, likely you are going to need to spend money. Moab is a possibility, though for SuSE, I would recommend looking at Concurrent Thinking's appliance. It will cost money, but they solve pretty much all of the problems for you.

> Builder will do everything I need & more, but I do not have the funds to
> purchase a solution. I also certainly do not want to perform a manual
> OS install on all 64 Compute Nodes.

No... in all likelihood, you really don't want to do any installation to the nodes (stateless if possible).

> Thanks in advance for any & all help, advice, guidance, or pearls of
> wisdom that you can provide this Neophyte. Oh and please don't ask why
> SLES 10sp2, I've already been through that one with management. It is
> what I have been provided & will make work.
It's not an issue, though we recommend better kernels/kernel updates. Compared to the RHEL kernels, it uses modern stuff.

Joe

> Steven A. Herborn
> U.S. Naval Academy
> Advanced Research Computing
> 410-293-6480 (Desk)
> 757-418-0505 (Cell)
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Beowulf mailing list, Beowulf@beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615