A few comments, below:
Leslie Rhorer wrote:
I need a little (or maybe more than a little) advice and guidance on setting up
a High Availability cluster on some Debian machines. I've read through the man
pages and the config files, but I'm falling short of understanding everything I
need to do. I am still in the process of obtaining all the software, so I
don't yet have the full process plan laid out, let alone all configured, but I
know I already need some help, so thanks in advance.
First of all, let me outline the situation. I've written an HVAC control suite
that works with a number of wireless thermostats to control the air handlers in
my house. I have one device which monitors the status lines from the
thermostats, and a second device that controls relays that open and close the
air vents as needed and turns on or off the appropriate air handlers as needed.
Each of those devices has an IP address and is controlled by a simple binary
(a C program). The binary sits in memory and polls the contact monitor every
couple of seconds, then uses the data obtained from the monitor to drive the
relays and writes the information to a couple of small data files
for use by some scripts that provide CL and Web status of the systems. One of
these scripts has to run once a minute or so in order to maintain a historical
record of how much time each unit spends running. I have the Web data online at
http://fletchergeek.homelinux.net
for anyone who would like to see.
Right now I have the binary and the CL scripts running on a little Raspberry Pi
(hostname Thermostat), but I don't want to have my Air Conditioning fail if the
little RPi is down or for whatever reason not talking to either of the two
terminal devices. I have two servers (hostnames RAID-Server and Backup),
either of which can take over in that event. If one or both of the terminal
devices are unavailable to all three servers, then I want to be alerted to the
fact. By my understanding so far, this means the three servers need to be set
up as cluster members and the two terminal devices set up as pseudo-cluster
ping devices. I have a set of scripts that restart the binary if it hangs and
report to me if they have to restart the binary more than 5 times in a row, but I
believe all that can be handled by the cluster manager (or is it directly
handled by Heartbeat?). Right now a cron job handles running the data
collection correlation once a minute, but I take it that function will have to
be taken over by a script that runs continuously on the active cluster node.
That's about as far as I have gotten though.
Both big machines are running Debian Jessie, while the RPi is running Raspbian,
a Wheezy derivative. To my understanding, Pacemaker is the best cluster
management system for this purpose, but evidently one of the libraries used by
Pacemaker did not make it to the Jessie distro, so the entire package has been
removed from the distro. Unless something has changed in the last couple of
months, evidently I am going to have to compile from source on those two
machines. Pacemaker should be available on the RPi, but it will no doubt be a
different version than that running on the Jessie machines. Will that create
issues?
My familiarity with pacemaker is in the more common cluster setup -
mirroring a virtual machine stack (in my case, Xen), and its associated
disks (using DRBD) - for automatic failover of entire virtual machines.
I haven't tried using pacemaker by itself for application level failover.
One way to set things up would be simply to set up mirrored virtual
machines, and let failover be handled at that level (you could use
pacemaker or the Remus functionality of later Xen implementations).
You might also look at pure application layer redundancy - pacemaker
might or might not help you - you might be better off just running two
copies of your scripts with some basic synchronization and
primary/secondary logic. (Personally, I'd do this in Erlang, which
makes this kind of application rather trivial).
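
To make the primary/secondary idea a little more concrete, here is a minimal sketch in Python. It is not taken from Pacemaker or Heartbeat and is only one possible approach. The hostnames come from the original post; the priority order, the timeout value, and the names should_be_active, PRIORITY and HEARTBEAT_TIMEOUT are all assumptions made up for illustration. Each node records when it last heard a heartbeat from each peer, and runs the control binary only if no higher-priority node appears to be alive.

```python
# Hypothetical sketch: priority-based primary/secondary election.
# Priority order is an assumption: the RPi is preferred, then the two servers.
PRIORITY = ["Thermostat", "RAID-Server", "Backup"]

# Seconds without a heartbeat before a peer is presumed down (assumed value).
HEARTBEAT_TIMEOUT = 10.0


def should_be_active(me, last_seen, now,
                     priority=PRIORITY, timeout=HEARTBEAT_TIMEOUT):
    """Return True if no higher-priority node has sent a heartbeat recently.

    me        -- hostname of the node making the decision
    last_seen -- dict mapping hostname -> timestamp of its last heartbeat
    now       -- current timestamp, in the same units as last_seen
    """
    for host in priority:
        if host == me:
            # Every node ahead of us is silent, so we take over.
            return True
        seen = last_seen.get(host)
        if seen is not None and now - seen <= timeout:
            # A higher-priority node is alive; stay secondary.
            return False
    raise ValueError(f"{me!r} is not in the priority list")
```

How the heartbeats get exchanged (UDP packets, a shared file, and so on) is deliberately left out. Note that a scheme this simple cannot tell a dead peer from a broken network link, so two nodes that lose sight of each other could both decide to be active; guarding against that is a large part of what a real cluster manager does.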
For more pacemaker help, check out the resources at
http://clusterlabs.org/, and maybe pose this query on the linux-ha email
list.
Hope this helps,
Miles Fidelman
--
In theory, there is no difference between theory and practice.
In practice, there is. .... Yogi Berra