On Tue, Jul 8, 2014, at 02:59, Andrew Beekhof wrote:
>
> On 4 Jul 2014, at 3:16 pm, Giuseppe Ragusa <[email protected]>
> wrote:
>
> > Hi all,
> > I'm trying to create a script as per subject (on CentOS 6.5,
> > CMAN+Pacemaker, only DRBD+KVM active/passive resources; SNMP-UPS monitored
> > by NUT).
> >
> > Ideally I think that each node should stop (disable) all locally-running
> > VirtualDomain resources (doing so cleanly demotes then downs the DRBD
> > resources underneath), then put itself in standby and finally shut down.
>
> Since the end goal is shutdown, why not just run 'pcs cluster stop' ?
I thought that this action would interrupt cluster communication (since
Corosync would stop responding to the peer) and so cause the other node to
stonith us. I know that ideally the other node should also perform "pcs cluster
stop" shortly afterwards, since the same UPS powers both, but I worry about
timing issues (and races) in UPS monitoring, since it is a large enterprise UPS
monitored via SNMP.
Furthermore, I do not know what happens to running resources at "pcs cluster
stop": I infer from your suggestion that resources are brought down and not
migrated to the other node, correct?
> Possibly with 'pcs cluster standby' first if you're worried that stopping the
> resources might take too long.
I thought that "pcs cluster standby" would usually migrate the resources to the
other node (I actually tried it and confirmed the expected behaviour), so it
would risk racing against the other node's own standby. This is why I took the
trouble of explicitly stopping all locally-running resources, in order, in my
script BEFORE putting the local node in standby.
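
To make the ordering concrete, here is a minimal sketch of the bounded wait I
have in mind between disabling the resources and going to standby. The function
names (wait_until_empty, list_local_vms) and any timeout values are placeholders
of mine, not pcs features; the filter is the same grep/awk pipeline as in my
script, and it assumes "pcs status" keeps printing the node name at the end of
each started resource line.

```shell
#!/bin/bash
# Sketch only: hypothetical helpers, not part of pcs or Pacemaker.

# wait_until_empty TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Polls COMMAND until it prints nothing.
# Returns 0 as soon as the output is empty, 1 if TIMEOUT_SECONDS elapse first.
# INTERVAL_SECONDS must be >= 1, otherwise the loop could never time out.
wait_until_empty() {
    local timeout="$1" interval="$2"
    shift 2
    local waited=0
    while [ -n "$("$@" 2>/dev/null)" ]; do
        if [ "${waited}" -ge "${timeout}" ]; then
            return 1
        fi
        sleep "${interval}"
        waited=$((waited + interval))
    done
    return 0
}

# list_local_vms NODE_NAME
# Prints the VirtualDomain resources that "pcs status" still shows on NODE_NAME.
list_local_vms() {
    pcs status 2>/dev/null |
        grep "ocf::heartbeat:VirtualDomain.*${1}[[:space:]]*\$" |
        awk '{print $1}'
}
```

In the script this would go right before "pcs cluster standby", for example:
wait_until_empty 300 5 list_local_vms "${local_node}" || echo "timed out" >&2
(300 and 5 are arbitrary values; the shutdown proceeds either way, so a timeout
cannot turn into an endless loop).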
> Pacemaker will stop everything in the required order and stop the node when
> done... problem solved?
I thought that after a "pcs cluster standby" a regular "shutdown -h" of the
operating system would cleanly bring down the cluster too, without the need for
a "pcs cluster stop", given that both Pacemaker and CMAN are correctly
configured for automatic startup/shutdown as operating system services (SysV
initscripts controlled by CentOS 6.5 Upstart, in my case).
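
For what it is worth, this is roughly how I would check the "configured for
automatic startup" part on each node. It is a sketch under assumptions: the
helper names are mine, the service names (cman, pacemaker) are the ones on my
CentOS 6.5 nodes, and it relies on the usual "chkconfig --list SERVICE" output
format (service name followed by 0:off ... 6:off fields). It only shows that
the services are enabled for a runlevel; it does not prove that the stop on OS
shutdown is clean.

```shell
#!/bin/bash
# Sketch only: hypothetical helpers built around "chkconfig --list".

# runlevel_is_on RUNLEVEL
# Reads one line of "chkconfig --list SERVICE" output on stdin, e.g.
#   pacemaker  0:off 1:off 2:on 3:on 4:on 5:on 6:off
# Returns 0 if the service is "on" in RUNLEVEL, 1 otherwise.
runlevel_is_on() {
    tr -s '[:space:]' '\n' | grep -qx "${1}:on"
}

# check_autostart RUNLEVEL SERVICE [SERVICE...]
# Warns on stderr about every SERVICE not enabled in RUNLEVEL.
# Returns 0 only if all of them are enabled.
check_autostart() {
    local runlevel="$1" svc rc=0
    shift
    for svc in "$@"; do
        if ! chkconfig --list "${svc}" 2>/dev/null | runlevel_is_on "${runlevel}"; then
            echo "WARNING: ${svc} is not enabled in runlevel ${runlevel}" >&2
            rc=1
        fi
    done
    return "${rc}"
}
```

Example: check_autostart 3 cman pacemaker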
Many thanks again for your always thought-provoking and informative answers!
Regards,
Giuseppe
> >
> > On the subsequent startup, manual intervention would be required to
> > unstandby all nodes and enable resources (nodes that were already in standby
> > and resources that were already disabled before the blackout would have to
> > be told apart manually).
> >
> > Is this strategy conceptually safe?
> >
> > Unfortunately, various searches have turned up no "prior art" :)
> >
> > This is my tentative script (consider it in the public domain):
> >
> > ------------------------------------------------------------------------------------------------------------------------------------
> > #!/bin/bash
> >
> > # Note: "pcs cluster status" still has a small bug vs. CMAN-controlled
> > # Corosync and would always return != 0
> > pcs status > /dev/null 2>&1
> > STATUS=$?
> >
> > # Detect if cluster is running at all on local node
> > # TODO: detect node already in standby and bypass this
> > if [ "${STATUS}" = 0 ]; then
> >     local_node="$(cman_tool status | grep -i 'Node[[:space:]]*name:' |
> >         sed -e 's/^.*Node\s*name:\s*\([^[:space:]]*\).*$/\1/i')"
> >     for local_resource in $(pcs status 2>/dev/null |
> >             grep "ocf::heartbeat:VirtualDomain.*${local_node}\\s*\$" |
> >             awk '{print $1}'); do
> >         pcs resource disable "${local_resource}"
> >     done
> >     # TODO: each resource disabling above may return without waiting for
> >     # complete stop - wait here for "no more resources active"? (but avoid
> >     # endless loops)
> >     pcs cluster standby "${local_node}"
> > fi
> >
> > # Shut down gracefully anyway at the end
> > /sbin/shutdown -h +0
> >
> > ------------------------------------------------------------------------------------------------------------------------------------
> >
> > Comments/suggestions/improvements are more than welcome.
> >
> > Many thanks in advance.
> >
> > Regards,
> > Giuseppe
> >
> > _______________________________________________
> > Pacemaker mailing list: [email protected]
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
>
--
Giuseppe Ragusa
[email protected]