Hi Charles
I've been looking for time to answer this more fully than the quick
one-shot mail I sent off earlier.
We run 600 servers on gentoo and started with a single server originally.
So yes it's definitely doable :)
I'll try to answer as much as possible; if you have any questions, feel
free to mail.
Charles Duffy wrote:
> I'm looking at replacing SuSE SLES9 with Gentoo for an enterprise
> application (for reasons of flexibility and licensing) (no, we don't
> have an enterprise application budget -- just the reliability
> requirements; yaaay, startups!). We're looking to be able to deploy and
> manage hundreds of geographically distributed servers.
See above, what is your planned initial deployment ? Are you starting
with a hundred or more servers or are you starting with just a couple ?
> We have a QA department available to vet each configuration before it is
> deployed to the field. We have infrastructure for tracking the progress
> of code in svn from creation through QA to deployment; I'm anticipating
> tracking a local overlay (containing all packages we use), make.conf,
> /etc/portage/*, etc. through this system, autobuilding system images
> (either to run virtualized or on real hardware) from the contents of
> svn, building binary packages and deploying them to real hardware.
Having a QA department to offload this work to is certainly a bonus :)
> I'm interested in best practices, suggested tools, and/or 3rd party
> experiences in this regard.
> Some particular questions which come to mind:
> - Should I be using a custom profile or a standard profile with
> overrides through make.conf, /etc/portage/* and the like?
AFAIK you should be able to set all required stuff through overrides.
The point is to keep in mind the benefits of using gentoo and not try
and work against the system.
Several people discussed running gentoo servers in production without a
build toolchain (gcc etc.).
I have no comments to offer on how desirable this is, but if this is a
goal/requirement for you I'd strongly suggest using a binary distro.
Gentoo shines as an ultra-configurable source-based distro. Running it
without a build toolchain and/or portage tree is certainly possible,
but it would take away many of the advantages of using gentoo, so why
not switch to something else in that case?
Removing the portage tree has always been a weird question to me. Nobody
discusses removing the rpm package database, so why are people so keen
on removing the portage tree?
It takes roughly 550 MB of space, which is quite a lot but hardly a
killer requirement given today's disk space and hardware.
The linux kernel source tree is roughly in the same size category.
> - What's the Right Way to create new system images ready to be loaded
> onto a hard drive or run through a virtual machine? gentoo-buildhoster
> looks interesting. I've seen Catalyst mentioned as a way to create
> stage3 images, but what documentation I've been able to find doesn't
> seem very much targeted for my use cases.
I would recommend catalyst-2. Although documentation is lacking, it
isn't that hard to set up.
You're probably looking for the stage4 target if you want to build
system images.
Rolling out gentoo on such a large scale, you need a repeatable system
image build environment.
The bonus of catalyst is that you automatically get a binary package
server in the process of generating your images.
Catalyst can be told to use a binary package cache; by carefully setting
up your catalyst environment you can easily reuse that cache as the
source for your binpkg server.
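A minimal sketch of the reuse idea above, assuming the package cache is
just a directory of binary packages: any plain HTTP server rooted at it
can act as the binpkg host. The cache path, port and host name below
are placeholders (not taken from a real setup), and only the Python
standard library is used.

```python
import functools
import http.server

# Hypothetical location of the catalyst package cache; adjust to your setup.
PKGCACHE_DIR = "/var/tmp/catalyst/packages"


def binhost_url(host, port=8000, subdir=""):
    """Build the URL that clients would put in PORTAGE_BINHOST."""
    path = subdir.strip("/")
    return f"http://{host}:{port}/{path}"


def serve(directory=PKGCACHE_DIR, port=8000):
    """Serve `directory` over HTTP until interrupted (this call blocks)."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    with http.server.ThreadingHTTPServer(("", port), handler) as httpd:
        httpd.serve_forever()
```

On the build host you would call serve(); on the clients you would set
PORTAGE_BINHOST in make.conf to the value of binhost_url() for that
host and install with emerge --getbinpkg. A real deployment would more
likely sit behind an existing web server, but the directory layout is
the same.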
I'm not familiar with gentoo-buildhoster, but since its web page [1]
lists it as no longer maintained, that would be a no-go area for me.
We combined catalyst with a pxe based boot environment, the quickstart
installer [2] and puppet [3].
It allows us to provision a server within 30 minutes. That's 30 minutes
from connecting the hardware to a switch to being active in production.
This requires very little manual intervention, which we consider to be a
good thing (tm)
And yes, that works concurrently: we believe it can handle roughly 30
servers being set up at the same time, and that ceiling appears to be a
pxe limitation.
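If you drive installs from a script, a ceiling like that can be enforced
on the scheduling side. The following is only a sketch: provision_one
is a hypothetical callable standing in for whatever kicks off one
install, and the limit of 30 is simply the figure mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed ceiling, taken from the "roughly 30" figure above.
MAX_CONCURRENT_INSTALLS = 30


def provision_all(hosts, provision_one, limit=MAX_CONCURRENT_INSTALLS):
    """Run provision_one(host) for every host, at most `limit` at a time.

    Returns a dict mapping each host to its result, or to the exception
    it raised, so one failed install does not abort the whole batch.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=limit) as pool:
        futures = {host: pool.submit(provision_one, host) for host in hosts}
        for host, future in futures.items():
            try:
                results[host] = future.result()
            except Exception as exc:  # record the failure and carry on
                results[host] = exc
    return results
```

Hosts beyond the limit simply wait in the pool's queue until a slot
frees up, so a batch of any size never exceeds the cap.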
If at all possible, try to build your deployment system so that you
can always easily wipe a server and reinstall.
We didn't originally and are refactoring to allow it now.
> - Any experiences with puppet? With our ratio of servers to staff,
> automating configuration and administration is a priority. (We already
> have an internal tool written with automating the server configuration
> process in mind; it has some functionality puppet doesn't, and puppet
> has functionality it doesn't; in theory, I'd like to extend puppet until
> our internal tool becomes unnecessary, though I'll need to understand
> puppet much better before I can think too hard about that).
We are using puppet extensively. It works, although it's still rough
around the edges, which is as expected from such a young project.
The gentoo provider for puppet is in its infancy. It works, but
definitely needs work as well.
Apart from that puppet is a very versatile and powerful tool.
And most importantly it has a very active community of people around it.
They are actively exchanging recipes for server configuration, which is
useful in itself but becomes extremely useful when combined with
the new module organization in puppet.
Many problems you will face in deploying and configuring this many
servers will have been solved in whole or in part by someone in the
puppet community.
I would be curious what functionality is missing from puppet right now
in your opinion ?
> - Have any of y'all been in the 100s-of-servers situation with
> comparable requirements and come up with a different approach to
> managing it? How did things work out?
We've grown very very fast and have tried different methods along the way.
To be honest, we're still in the process of moving most of the
server park under puppet control (nearly 50% done).
And I actually do not expect to find a single set of tools that copes
with all the issues that you face when deploying this many servers.
Your situation might be different because you are starting with a single
app.
One vital thing that is missing from the picture is an inventory database.
You need some sort of queryable database that stores servers, location,
networking info, function, a server-identifying serial of some sort,
hardware classes, deployment status etc.
Without that you're basically lost. We use a homebrew MySQL-based system
with both a CLI and a web interface.
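To make the idea concrete, here is a minimal sketch of such an
inventory. It is not the homebrew system described above: it uses
SQLite from the Python standard library as a stand-in for MySQL, and
the table layout, column names and helper functions are all assumptions
made for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS servers (
    serial         TEXT PRIMARY KEY,            -- server-identifying serial
    hostname       TEXT NOT NULL UNIQUE,
    location       TEXT NOT NULL,               -- datacenter / rack
    ip_address     TEXT,                        -- networking info
    role           TEXT,                        -- function, e.g. web, db, log
    hardware_class TEXT,
    status         TEXT NOT NULL DEFAULT 'new'  -- deployment status
)
"""


def open_inventory(path=":memory:"):
    """Open (and if needed create) the inventory database."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    conn.execute(SCHEMA)
    return conn


def add_server(conn, serial, hostname, location, ip_address=None,
               role=None, hardware_class=None, status="new"):
    """Insert one server record."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO servers VALUES (?, ?, ?, ?, ?, ?, ?)",
            (serial, hostname, location, ip_address, role,
             hardware_class, status),
        )


def find_servers(conn, role=None, status=None):
    """Return hostnames matching the given role and/or status, sorted."""
    query = "SELECT hostname FROM servers WHERE 1=1"
    params = []
    if role is not None:
        query += " AND role = ?"
        params.append(role)
    if status is not None:
        query += " AND status = ?"
        params.append(status)
    query += " ORDER BY hostname"
    return [row["hostname"] for row in conn.execute(query, params)]
```

A query such as find_servers(conn, role="web", status="production") is
the kind of question you end up asking constantly, and the same table
can feed a pxe boot config or a puppet node list.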
Apart from that, you'll need lots of infrastructure:
* logservers
* monitoring
* statistics gathering
* backups
* scripts repository
* version controlled configs
* bug / issue tracking
* firewalling
* loadbalancing
* network monitoring and configuration system
* etc. etc. etc.
> Thank you!
You're welcome. I would like to see more people use gentoo in
large-scale environments and am actively looking for possibilities to
exchange experiences.
Feel free to contact me if you have any questions. I also hang out in a
couple of irc channels [4] under the nickname Innocenti.
Out of pure curiosity, what is your staff to server ratio?
Grtz Ramon
Senior System Administrator Hyves.nl
[1] http://badpenguins.com/source/
[2] http://dev.gentoo.org/~agaffney/quickstart.php
[3] http://puppet.reductivelabs.com/
[4] #puppet, #gentoo-server, #gentoo-cluster, #gentoo-amd64