On 09/05/2017 07:14 PM, Stu Midgley wrote:
> I'm not feeling much love for puppet.

I'm pretty fond of puppet for managing clusters. We use cobbler to go from
PXE boot -> installed, then puppet takes over.

Some of my favorite features:

* Inheritance is handy: node -> node for a particular cluster -> compute
  node -> head node.
* Tags for handling users are handy. With 1200 users, a dozen clusters, and
  various other bits of infrastructure, they make it really easy to manage
  who gets access to what.
* I like the self-healing aspect: defining the system state, not how to get
  there. That way, if I need to repurpose or patch a node, or mistakenly
  make a node unique in some way, the next puppet run fixes it.
* It definitely helps with re-use across clusters, which makes for a higher
  incentive to do it right the first time.
* Using facts to make decisions is really useful, for things like detecting
  whether you are a virtual machine, or updating autofs maps if IB is down.

> On Wed, Sep 6, 2017 at 7:51 AM, Christopher Samuel
> <sam...@unimelb.edu.au> wrote:
>
> > On 05/09/17 15:24, Stu Midgley wrote:
> >
> > > I am in the process of redeveloping our cluster deployment and config
> > > management environment and wondered what others are doing?
> >
> > xCAT here for all HPC related infrastructure. Stateful installs for
> > GPFS NSD servers and TSM servers, compute nodes are all statelite, so an
> > immutable RAMdisk image is built on the management node for the compute
> > cluster and then on boot they mount various items over NFS (including
> > the GPFS state directory).
> >
> > Nothing like your scale, of course, but it works and we know if a node
> > has booted a particular image it will be identical to any other node
> > that's set to boot the same image.
> >
> > Healthcheck scripts mark nodes offline if they don't have the current
> > production kernel and GPFS versions (and other checks too of course)
> > plus Slurm's "scontrol reboot" lets us do rolling reboots without
> > needing to spot when nodes have become idle.
> >
> > I've got to say I really prefer this to systems like Puppet, Salt, etc,
> > where you need to go and tweak an image after installation.
> >
> > For our VM infrastructure (web servers, etc) we do use Salt for that. We
> > used to use Puppet but we switched when the only person who understood
> > it left. Don't miss it at all...
> >
> > cheers,
> > Chris
> > --
> > Christopher Samuel        Senior Systems Administrator
> > Melbourne Bioinformatics - The University of Melbourne
> > Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
> >
> > _______________________________________________
> > Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> > To change your subscription (digest mode or unsubscribe) visit
> > http://www.beowulf.org/mailman/listinfo/beowulf
>
> --
> Dr Stuart Midgley
> sdm...@gmail.com
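
P.S. For anyone who hasn't worked with the "define the state, not how to get there" model mentioned above, here is a rough Python sketch of the idea. It is not Puppet code and does not use Puppet's API; the function names (desired_state, converge), the resource keys, and the "ib_up" fact are all made up for illustration. The point is only that the desired state is computed from facts, and that a second run against an already-correct node changes nothing.

```python
# Hypothetical illustration of declarative, self-healing config management.
# None of these names come from Puppet; they are assumptions for the sketch.

def desired_state(facts):
    """Build the desired state for a node from facts reported about it."""
    state = {"service:autofs": "running"}
    # Fact-driven decision: pick a different autofs map if IB is down.
    if facts.get("ib_up"):
        state["file:auto.master"] = "maps-over-ib"
    else:
        state["file:auto.master"] = "maps-over-ethernet"
    return state


def converge(desired, actual):
    """Bring `actual` in line with `desired`; return the changes made.

    Each change is a (resource, old_value, new_value) tuple. Resources that
    already match are left alone, so repeated runs are harmless.
    """
    changes = []
    for resource, want in desired.items():
        have = actual.get(resource)
        if have != want:
            changes.append((resource, have, want))
            actual[resource] = want
    return changes


if __name__ == "__main__":
    # A node that has drifted: autofs stopped, no map file in place.
    node = {"service:autofs": "stopped"}
    facts = {"ib_up": False}

    print(converge(desired_state(facts), node))  # first run repairs drift
    print(converge(desired_state(facts), node))  # second run finds nothing
```

The first run reports two changes (autofs started, map file written); the second reports none, which is the behaviour that lets a periodic run quietly repair a node that was patched by hand.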