Hey Team,
I'm seeing strange intermittent failovers on a two-node cluster
(about once every week or two). When this happens, both nodes become
unavailable: one node is marked offline and the other is shown as
unclean. Any help with this would be greatly appreciated. Thanks.
Running Ubuntu 12.04 (64-bit)
Pacemaker 1.1.6-2ubuntu3.3
Corosync 1.4.2-2ubuntu0.2
Here are the logs:
Nov 08 14:26:26 corosync [pcmk ] info: pcmk_ipc_exit: Client crmd (conn=0x12bebe0, async-conn=0x12bebe0) left
Nov 08 14:26:26 corosync [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
Nov 08 14:26:27 corosync [pcmk ] info: pcmk_ipc_exit: Client attrd (conn=0x12d0230, async-conn=0x12d0230) left
Nov 08 14:26:32 corosync [pcmk ] info: pcmk_ipc_exit: Client cib (conn=0x12c7d80, async-conn=0x12c7d80) left
Nov 08 14:26:32 corosync [pcmk ] info: pcmk_ipc_exit: Client stonith-ng (conn=0x12c3a20, async-conn=0x12c3a20) left
Nov 08 14:26:32 corosync [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
Nov 08 14:26:32 corosync [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)
Nov 08 14:26:32 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x12bebe0 for stonith-ng/0
Nov 08 14:26:32 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x12c2f40 for attrd/0
Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x12c72a0 for cib/0
Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Sending membership update 12 to cib
Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x12cb600 for crmd/0
Nov 08 14:26:33 corosync [pcmk ] info: pcmk_ipc: Sending membership update 12 to crmd
Output of crm configure show:
node p-sbc3 \
    attributes standby="off"
node p-sbc4 \
    attributes standby="off"
primitive fs lsb:FSSofia \
    op monitor interval="2s" enabled="true" timeout="10s" on-fail="standby" \
    meta target-role="Started"
primitive fs-ip ocf:heartbeat:IPaddr2 \
    params ip="10.100.0.90" nic="eth0:0" cidr_netmask="24" \
    op monitor interval="10s"
primitive fs-ip2 ocf:heartbeat:IPaddr2 \
    params ip="10.100.0.99" nic="eth0:1" cidr_netmask="24" \
    op monitor interval="10s"
group cluster_services fs-ip fs-ip2 fs \
    meta target-role="Started"
property $id="cib-bootstrap-options" \
    dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    last-lrm-refresh="1348755080" \
    no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
    resource-stickiness="100"
_______________________________________________
Pacemaker mailing list: [email protected]
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org