Re: [ClusterLabs] Antw: stonith continues to reboot server once fencing occurs

Dickerson, Charles {Chuck} (JSC-EG)[Jacobs Technology, Inc.] Fri, 11 May 2018 10:02:42 -0700

I have attached the /var/log/cluster/corosync.log here.

The fenced node continues to be rebooted even after the stonith timeout.  The 
only way I have of stopping the reboot cycle is to completely stop the cluster 
on the remaining node.


Stonith should be able to detect that the fenced node was successfully rebooted 
and stop trying to fence it.  I have tried this using both the cycle method and 
the onoff method; both give the same result.
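
For anyone reading the attached log, here is a minimal sketch (Python, not
something run on the cluster itself) that compares how long the fence
operation took with the total timeout stonith-ng logged.  The function name
fence_timing, the regular expressions, and the year default are assumptions
made for illustration; the three sample lines are taken from the attached
log and unwrapped to one entry per line.

```python
import re
from datetime import datetime

# Sample entries from the attached corosync.log, unwrapped to one per line.
SAMPLE = """\
May 10 15:25:09 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:   notice: handle_request:       Client stonith_admin.11395.c4ac81ce wants to fence (reboot) 'vmhost1-fsl.bcn' with device '(any)'
May 10 15:25:09 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: call_remote_stonith:     Total timeout set to 120 for peer's fencing of vmhost1-fsl.bcn for stonith_admin.11395|id=11676486-cc58-4ac5-baae-f015cd660998
May 10 15:25:24 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:   notice: log_operation:   Operation 'reboot' [11396] (call 2 from stonith_admin.11395) for host 'vmhost1-fsl.bcn' with device 'fence_vmhost1_ipmi' returned: 0 (OK)
"""

# Leading syslog-style timestamp, e.g. "May 10 15:25:09".
STAMP = re.compile(r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) ")
TIMEOUT = re.compile(r"Total timeout set to (\d+)")


def fence_timing(log_text, year=2018):
    """Return (elapsed_seconds, total_timeout) for the first fence request.

    The log carries no year, so `year` is an assumption supplied by the caller.
    """
    requested = finished = timeout = None
    for line in log_text.splitlines():
        stamp = STAMP.match(line)
        if not stamp:
            continue  # corosync's own lines have no timestamp prefix
        when = datetime.strptime(f"{year} {stamp.group(1)}", "%Y %b %d %H:%M:%S")
        if requested is None and "wants to fence" in line:
            requested = when
        found = TIMEOUT.search(line)
        if found and timeout is None:
            timeout = int(found.group(1))
        if finished is None and "log_operation" in line and "returned:" in line:
            finished = when
    elapsed = None
    if requested is not None and finished is not None:
        elapsed = (finished - requested).total_seconds()
    return elapsed, timeout


elapsed, timeout = fence_timing(SAMPLE)
print(f"fence operation took {elapsed:.0f}s, total timeout {timeout}s")
```

On these three entries the request is logged at 15:25:09 and the reboot
operation returns OK at 15:25:24, against a logged total timeout of 120.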

Chuck Dickerson
Jacobs
JSC - EG3
(281) 244-5895

-----Original Message-----
From: Users [mailto:[email protected]] On Behalf Of Ulrich Windl
Sent: Friday, May 11, 2018 8:47 AM
To: [email protected]
Subject: [ClusterLabs] Antw: stonith continues to reboot server once fencing 
occurs

Hi!

Could it be that the node reboots faster than the stonith timeout? So the node 
will unexpectedly come up...

Without logs it's hard to say.

Regards,
Ulrich

>>> "Dickerson, Charles Chuck (JSC-EG)[Jacobs Technology, Inc.]"
<[email protected]> wrote on 11.05.2018 at 15:32 in message
<[email protected]>:
> I have a 2 node cluster, once fencing occurs, the fenced node is 
> continually rebooted every time it comes up.
> 
> Configuration:  2 identical nodes - CentOS 7.4, pacemaker 1.1.18, pcs 
> 0.9.162, fencing configured using fence_ipmilan.  The cluster is set to 
> ignore quorum and stonith is enabled.  Firewalld has been disabled.
> 
> I can manually issue the fence_ipmilan command and the specified node 
> is rebooted, comes back up and fence_ipmilan sees this and reports success.
> 
> If fencing is initiated via the "pcs stonith fence" command, 
> stonith_admin command, or by disrupting the communication between the 
> nodes, the proper node is rebooted, but the stonith_admin command 
> times out and never sees the node as rebooted.  The node is then 
> rebooted every time it comes back up on the network.  The status 
> remains UNCLEAN in pcs status.
> 
> Chuck Dickerson
> Jacobs
> JSC ‑ EG3
> (281) 244‑5895



_______________________________________________
Users mailing list: [email protected] 
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

May 10 15:25:09 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:   notice: 
handle_request:       Client stonith_admin.11395.c4ac81ce wants to fence 
(reboot) 'vmhost1-fsl.bcn' with device '(any)'
May 10 15:25:09 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:   notice: 
initiate_remote_stonith_op:      Requesting peer fencing (reboot) of 
vmhost1-fsl.bcn | id=11676486-cc58-4ac5-baae-f015cd660998 state=0
May 10 15:25:09 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:   notice: 
can_fence_host_with_device:      fence_vmhost0_ipmi can not fence (reboot) 
vmhost1-fsl.bcn: static-list
May 10 15:25:09 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:   notice: 
can_fence_host_with_device:      fence_vmhost1_ipmi can fence (reboot) 
vmhost1-fsl.bcn: static-list
May 10 15:25:09 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
process_remote_stonith_query:    Query result 1 of 2 from vmhost0-fsl.bcn for 
vmhost1-fsl.bcn/reboot (1 devices) 11676486-cc58-4ac5-baae-f015cd660998
May 10 15:25:09 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
process_remote_stonith_query:    Query result 2 of 2 from vmhost1-fsl.bcn for 
vmhost1-fsl.bcn/reboot (1 devices) 11676486-cc58-4ac5-baae-f015cd660998
May 10 15:25:09 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
process_remote_stonith_query:    All query replies have arrived, continuing (2 
expected/2 received) 
May 10 15:25:09 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
call_remote_stonith:     Total timeout set to 120 for peer's fencing of 
vmhost1-fsl.bcn for stonith_admin.11395|id=11676486-cc58-4ac5-baae-f015cd660998
May 10 15:25:09 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
call_remote_stonith:     Requesting that 'vmhost0-fsl.bcn' perform op 
'vmhost1-fsl.bcn reboot' for stonith_admin.11395 (144s, 0s)
May 10 15:25:09 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:   notice: 
can_fence_host_with_device:      fence_vmhost0_ipmi can not fence (reboot) 
vmhost1-fsl.bcn: static-list
May 10 15:25:09 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:   notice: 
can_fence_host_with_device:      fence_vmhost1_ipmi can fence (reboot) 
vmhost1-fsl.bcn: static-list
May 10 15:25:09 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
stonith_fence_get_devices_cb:    Found 1 matching devices for 'vmhost1-fsl.bcn'
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
crm_update_peer_expected:        handle_request: Node vmhost1-fsl.bcn[2] - 
expected state is now down (was member)
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
handle_shutdown_request: Creating shutdown request for vmhost1-fsl.bcn 
(state=S_IDLE)
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
update_attrd_helper:     Connecting to attribute manager ... 5 retries remaining
May 10 15:25:13 [11142] vmhost0-fsl.jsc.nasa.gov      attrd:     info: 
attrd_peer_update:       Setting shutdown[vmhost1-fsl.bcn]: (null) -> 
1525983913 from vmhost0-fsl.bcn
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  Diff: --- 0.13.20 2
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  Diff: +++ 0.13.21 (null)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  +  /cib:  @num_updates=21
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  ++ /cib/status/node_state[@id='2']:  <transient_attributes 
id="2"/>
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  ++                                     <instance_attributes 
id="status-2">
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  ++                                       <nvpair 
id="status-2-shutdown" name="shutdown" value="1525983913"/>
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  ++                                     </instance_attributes>
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  ++                                   </transient_attributes>
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Completed cib_modify operation for section status: OK 
(rc=0, origin=vmhost1-fsl.bcn/attrd/4, version=0.13.21)
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
abort_transition_graph:  Transition aborted by transient_attributes.2 'create': 
Transient attribute change | cib=0.13.21 source=abort_unless_down:343 
path=/cib/status/node_state[@id='2'] complete=true
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:   notice: 
do_state_transition:     State transition S_IDLE -> S_POLICY_ENGINE | 
input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:   notice: 
unpack_config:   On loss of CCM Quorum: Ignore
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
determine_online_status_fencing: Node vmhost0-fsl.bcn is active
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
determine_online_status: Node vmhost0-fsl.bcn is online
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
determine_online_status: Node vmhost1-fsl.bcn is shutting down
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
unpack_node_loop:        Node 1 is already processed
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
unpack_node_loop:        Node 2 is already processed
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
unpack_node_loop:        Node 1 is already processed
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
unpack_node_loop:        Node 2 is already processed
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
common_print:    fence_vmhost0_ipmi      (stonith:fence_ipmilan):        
Started vmhost0-fsl.bcn
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
common_print:    fence_vmhost1_ipmi      (stonith:fence_ipmilan):        
Started vmhost1-fsl.bcn
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
RecurringOp:      Start recurring monitor (60s) for fence_vmhost1_ipmi on 
vmhost0-fsl.bcn
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:   notice: stage6:  
Scheduling Node vmhost1-fsl.bcn for shutdown
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:   notice: 
LogNodeActions:   * Shutdown vmhost1-fsl.bcn
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
LogActions:      Leave   fence_vmhost0_ipmi      (Started vmhost0-fsl.bcn)
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:   notice: 
LogAction:        * Move       fence_vmhost1_ipmi     ( vmhost1-fsl.bcn -> 
vmhost0-fsl.bcn )  
May 10 15:25:13 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:   notice: 
process_pe_message:      Calculated transition 1, saving inputs in 
/var/lib/pacemaker/pengine/pe-input-24.bz2
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
do_state_transition:     State transition S_POLICY_ENGINE -> 
S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE 
origin=handle_response
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
do_te_invoke:    Processing graph 1 (ref=pe_calc-dc-1525983913-22) derived from 
/var/lib/pacemaker/pengine/pe-input-24.bz2
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:   notice: 
te_rsc_command:  Initiating stop operation fence_vmhost1_ipmi_stop_0 on 
vmhost1-fsl.bcn | action 6
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  Diff: --- 0.13.21 2
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  Diff: +++ 0.13.22 (null)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  +  /cib:  @num_updates=22
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  +  
/cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='fence_vmhost1_ipmi']/lrm_rsc_op[@id='fence_vmhost1_ipmi_last_0']:
  @operation_key=fence_vmhost1_ipmi_stop_0, @operation=stop, 
@transition-key=6:1:0:15826900-16df-4cd4-9f7d-7213b59d15a6, 
@transition-magic=-1:193;6:1:0:15826900-16df-4cd4-9f7d-7213b59d15a6, 
@call-id=-1, @rc-code=193, @op-status=-1, @last-run=1525983913, 
@last-rc-change=152598391
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Completed cib_modify operation for section status: OK 
(rc=0, origin=vmhost1-fsl.bcn/crmd/18, version=0.13.22)
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
update_cib_stonith_devices_v2:   Updating device list from the cib: modify 
lrm_rsc_op[@id='fence_vmhost1_ipmi_last_0']
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
cib_devices_update:      Updating devices to version 0.13.22
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:   notice: 
unpack_config:   On loss of CCM Quorum: Ignore
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  Diff: --- 0.13.22 2
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  Diff: +++ 0.13.23 (null)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  +  /cib:  @num_updates=23
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  +  
/cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='fence_vmhost1_ipmi']/lrm_rsc_op[@id='fence_vmhost1_ipmi_last_0']:
  @transition-magic=0:0;6:1:0:15826900-16df-4cd4-9f7d-7213b59d15a6, 
@call-id=13, @rc-code=0, @op-status=0
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Completed cib_modify operation for section status: OK 
(rc=0, origin=vmhost1-fsl.bcn/crmd/19, version=0.13.23)
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
match_graph_event:       Action fence_vmhost1_ipmi_stop_0 (6) confirmed on 
vmhost1-fsl.bcn (rc=0)
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:   notice: 
te_rsc_command:  Initiating start operation fence_vmhost1_ipmi_start_0 locally 
on vmhost0-fsl.bcn | action 7
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
do_lrm_rsc_op:   Performing key=7:1:0:15826900-16df-4cd4-9f7d-7213b59d15a6 
op=fence_vmhost1_ipmi_start_0
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Forwarding cib_modify operation for section status to 
all (origin=local/crmd/42)
May 10 15:25:13 [11141] vmhost0-fsl.jsc.nasa.gov       lrmd:     info: 
log_execute:     executing - rsc:fence_vmhost1_ipmi action:start call_id:12
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
te_crm_command:  Executing crm-event (10): do_shutdown on vmhost1-fsl.bcn
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
update_cib_stonith_devices_v2:   Updating device list from the cib: modify 
lrm_rsc_op[@id='fence_vmhost1_ipmi_last_0']
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
cib_devices_update:      Updating devices to version 0.13.23
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  Diff: --- 0.13.23 2
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  Diff: +++ 0.13.24 (null)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  +  /cib:  @num_updates=24
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  +  
/cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='fence_vmhost1_ipmi']/lrm_rsc_op[@id='fence_vmhost1_ipmi_last_0']:
  @operation_key=fence_vmhost1_ipmi_start_0, @operation=start, 
@transition-key=7:1:0:15826900-16df-4cd4-9f7d-7213b59d15a6, 
@transition-magic=-1:193;7:1:0:15826900-16df-4cd4-9f7d-7213b59d15a6, 
@call-id=-1, @rc-code=193, @op-status=-1, @last-run=1525983913, 
@last-rc-change=1525983
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:   notice: 
unpack_config:   On loss of CCM Quorum: Ignore
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Completed cib_modify operation for section status: OK 
(rc=0, origin=vmhost0-fsl.bcn/crmd/42, version=0.13.24)
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
pcmk_cpg_membership:     Node 2 left group crmd (peer=vmhost1-fsl.bcn, 
counter=2.0)
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
crm_update_peer_proc:    pcmk_cpg_membership: Node vmhost1-fsl.bcn[2] - 
corosync-cpg is now offline
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
peer_update_callback:    Client vmhost1-fsl.bcn/peer now has status [offline] 
(DC=true, changed=4000000)
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
peer_update_callback:    Peer vmhost1-fsl.bcn left us
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
erase_status_tag:        Deleting transient_attributes status entries for 
vmhost1-fsl.bcn | 
xpath=//node_state[@uname='vmhost1-fsl.bcn']/transient_attributes
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Forwarding cib_delete operation for section 
//node_state[@uname='vmhost1-fsl.bcn']/transient_attributes to all 
(origin=local/crmd/43)
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:   notice: 
peer_update_callback:    do_shutdown of peer vmhost1-fsl.bcn is complete | op=10
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
crm_update_peer_join:    crmd_peer_down: Node vmhost1-fsl.bcn[2] - join-2 phase 
confirmed -> none
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
pcmk_cpg_membership:     Node 1 still member of group crmd 
(peer=vmhost0-fsl.bcn, counter=2.0)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Forwarding cib_modify operation for section status to 
all (origin=local/crmd/44)
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
update_cib_stonith_devices_v2:   Updating device list from the cib: modify 
lrm_rsc_op[@id='fence_vmhost1_ipmi_last_0']
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
cib_devices_update:      Updating devices to version 0.13.24
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:   notice: 
unpack_config:   On loss of CCM Quorum: Ignore
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  Diff: --- 0.13.24 2
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  Diff: +++ 0.13.25 bb7eb20c2878d573f8b81b8aed19e68f
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  -- 
/cib/status/node_state[@id='2']/transient_attributes[@id='2']
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  +  /cib:  @num_updates=25
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Completed cib_delete operation for section 
//node_state[@uname='vmhost1-fsl.bcn']/transient_attributes: OK (rc=0, 
origin=vmhost0-fsl.bcn/crmd/43, version=0.13.25)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  Diff: --- 0.13.25 2
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  Diff: +++ 0.13.26 (null)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  +  /cib:  @num_updates=26
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  +  /cib/status/node_state[@id='2']:  @crmd=offline, 
@crm-debug-origin=peer_update_callback, @join=down, @expected=down
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Completed cib_modify operation for section status: OK 
(rc=0, origin=vmhost0-fsl.bcn/crmd/44, version=0.13.26)
May 10 15:25:13 [11142] vmhost0-fsl.jsc.nasa.gov      attrd:     info: 
pcmk_cpg_membership:     Node 2 left group attrd (peer=vmhost1-fsl.bcn, 
counter=2.0)
May 10 15:25:13 [11142] vmhost0-fsl.jsc.nasa.gov      attrd:     info: 
crm_update_peer_proc:    pcmk_cpg_membership: Node vmhost1-fsl.bcn[2] - 
corosync-cpg is now offline
May 10 15:25:13 [11142] vmhost0-fsl.jsc.nasa.gov      attrd:   notice: 
crm_update_peer_state_iter:      Node vmhost1-fsl.bcn state is now lost | 
nodeid=2 previous=member source=crm_update_peer_proc
May 10 15:25:13 [11142] vmhost0-fsl.jsc.nasa.gov      attrd:   notice: 
attrd_peer_remove:       Removing all vmhost1-fsl.bcn attributes for peer loss
May 10 15:25:13 [11142] vmhost0-fsl.jsc.nasa.gov      attrd:   notice: 
attrd_peer_change_cb:    Lost attribute writer vmhost1-fsl.bcn
May 10 15:25:13 [11142] vmhost0-fsl.jsc.nasa.gov      attrd:     info: 
crm_reap_dead_member:    Removing node with name vmhost1-fsl.bcn and id 2 from 
membership cache
May 10 15:25:13 [11142] vmhost0-fsl.jsc.nasa.gov      attrd:   notice: 
reap_crm_member: Purged 1 peer with id=2 and/or uname=vmhost1-fsl.bcn from the 
membership cache
May 10 15:25:13 [11142] vmhost0-fsl.jsc.nasa.gov      attrd:     info: 
pcmk_cpg_membership:     Node 1 still member of group attrd 
(peer=vmhost0-fsl.bcn, counter=2.0)
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
pcmk_cpg_membership:     Node 2 left group stonith-ng (peer=vmhost1-fsl.bcn, 
counter=2.0)
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
crm_update_peer_proc:    pcmk_cpg_membership: Node vmhost1-fsl.bcn[2] - 
corosync-cpg is now offline
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_shutdown_req:        Shutdown REQ from vmhost1-fsl.bcn
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:   notice: 
crm_update_peer_state_iter:      Node vmhost1-fsl.bcn state is now lost | 
nodeid=2 previous=member source=crm_update_peer_proc
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
crm_reap_dead_member:    Removing node with name vmhost1-fsl.bcn and id 2 from 
membership cache
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:   notice: 
reap_crm_member: Purged 1 peer with id=2 and/or uname=vmhost1-fsl.bcn from the 
membership cache
May 10 15:25:13 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:     info: 
pcmk_cpg_membership:     Node 1 still member of group stonith-ng 
(peer=vmhost0-fsl.bcn, counter=2.0)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
pcmk_cpg_membership:     Node 2 left group cib (peer=vmhost1-fsl.bcn, 
counter=2.0)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
crm_update_peer_proc:    pcmk_cpg_membership: Node vmhost1-fsl.bcn[2] - 
corosync-cpg is now offline
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:   notice: 
crm_update_peer_state_iter:      Node vmhost1-fsl.bcn state is now lost | 
nodeid=2 previous=member source=crm_update_peer_proc
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
crm_reap_dead_member:    Removing node with name vmhost1-fsl.bcn and id 2 from 
membership cache
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:   notice: 
reap_crm_member: Purged 1 peer with id=2 and/or uname=vmhost1-fsl.bcn from the 
membership cache
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
pcmk_cpg_membership:     Node 1 still member of group cib 
(peer=vmhost0-fsl.bcn, counter=2.0)
May 10 15:25:13 [11138] vmhost0-fsl.jsc.nasa.gov pacemakerd:     info: 
pcmk_cpg_membership:     Node 2 left group pacemakerd (peer=vmhost1-fsl.bcn, 
counter=2.0)
May 10 15:25:13 [11138] vmhost0-fsl.jsc.nasa.gov pacemakerd:     info: 
crm_update_peer_proc:    pcmk_cpg_membership: Node vmhost1-fsl.bcn[2] - 
corosync-cpg is now offline
May 10 15:25:13 [11138] vmhost0-fsl.jsc.nasa.gov pacemakerd:     info: 
pcmk_cpg_membership:     Node 1 still member of group pacemakerd 
(peer=vmhost0-fsl.bcn, counter=2.0)
May 10 15:25:13 [11138] vmhost0-fsl.jsc.nasa.gov pacemakerd:     info: 
mcp_cpg_deliver: Ignoring process list sent by peer for local node
[11130] vmhost0-fsl.jsc.nasa.gov corosyncnotice  [TOTEM ] A new membership 
(192.168.1.140:184) was formed. Members left: 2
[11130] vmhost0-fsl.jsc.nasa.gov corosyncnotice  [QUORUM] Members[1]: 1
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
pcmk_quorum_notification:        Quorum retained | membership=184 members=1
[11130] vmhost0-fsl.jsc.nasa.gov corosyncnotice  [MAIN  ] Completed service 
synchronization, ready to provide service.
May 10 15:25:13 [11138] vmhost0-fsl.jsc.nasa.gov pacemakerd:     info: 
pcmk_quorum_notification:        Quorum retained | membership=184 members=1
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:   notice: 
crm_update_peer_state_iter:      Node vmhost1-fsl.bcn state is now lost | 
nodeid=2 previous=member source=crm_reap_unseen_nodes
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
peer_update_callback:    Cluster node vmhost1-fsl.bcn is now lost (was member)
May 10 15:25:13 [11138] vmhost0-fsl.jsc.nasa.gov pacemakerd:   notice: 
crm_update_peer_state_iter:      Node vmhost1-fsl.bcn state is now lost | 
nodeid=2 previous=member source=crm_reap_unseen_nodes
May 10 15:25:13 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:   notice: 
peer_update_callback:    do_shutdown of peer vmhost1-fsl.bcn is complete | op=10
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Forwarding cib_modify operation for section status to 
all (origin=local/crmd/45)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Forwarding cib_modify operation for section nodes to 
all (origin=local/crmd/48)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Forwarding cib_modify operation for section status to 
all (origin=local/crmd/49)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Completed cib_modify operation for section status: OK 
(rc=0, origin=vmhost0-fsl.bcn/crmd/45, version=0.13.26)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Completed cib_modify operation for section nodes: OK 
(rc=0, origin=vmhost0-fsl.bcn/crmd/48, version=0.13.26)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  Diff: --- 0.13.26 2
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  Diff: +++ 0.13.27 (null)
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  +  /cib:  @num_updates=27
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  +  /cib/status/node_state[@id='1']:  
@crm-debug-origin=post_cache_update
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_perform_op:  +  /cib/status/node_state[@id='2']:  @in_ccm=false, 
@crm-debug-origin=post_cache_update
May 10 15:25:13 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Completed cib_modify operation for section status: OK 
(rc=0, origin=vmhost0-fsl.bcn/crmd/49, version=0.13.27)
May 10 15:25:18 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_ping:        Reporting our current digest to vmhost0-fsl.bcn: 
06babcaef13cce59fbca2c77f46f92cd for 0.13.27 (0x56142f670c50 0)
[11130] vmhost0-fsl.jsc.nasa.gov corosyncnotice  [TOTEM ] The network interface 
is down.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncnotice  [TOTEM ] adding new UDPU 
member {192.168.1.140}
[11130] vmhost0-fsl.jsc.nasa.gov corosyncnotice  [TOTEM ] adding new UDPU 
member {192.168.1.141}
May 10 15:25:24 [11140] vmhost0-fsl.jsc.nasa.gov stonith-ng:   notice: 
log_operation:   Operation 'reboot' [11396] (call 2 from stonith_admin.11395) 
for host 'vmhost1-fsl.bcn' with device 'fence_vmhost1_ipmi' returned: 0 (OK)
May 10 15:25:24 [11141] vmhost0-fsl.jsc.nasa.gov       lrmd:     info: 
log_finished:    finished - rsc:fence_vmhost1_ipmi action:start call_id:12  
exit-code:0 exec-time:11302ms queue-time:0ms
May 10 15:25:24 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:   notice: 
process_lrm_event:       Result of start operation for fence_vmhost1_ipmi on 
vmhost0-fsl.bcn: 0 (ok) | call=12 key=fence_vmhost1_ipmi_start_0 confirmed=true 
cib-update=50
May 10 15:25:24 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Forwarding cib_modify operation for section status to 
all (origin=local/crmd/50)
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
May 10 15:25:54 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:  warning: 
cib_rsc_callback:        Resource update 50 failed: (rc=-62) Timer expired
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
May 10 15:26:33 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:  warning: 
action_timer_callback:   Timer popped (timeout=20000, abort_level=0, 
complete=false)
May 10 15:26:33 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:    error: 
print_synapse:   [Action    7]: In-flight rsc op fence_vmhost1_ipmi_start_0     
   on vmhost0-fsl.bcn (priority: 0, waiting: none)
May 10 15:26:33 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:   notice: 
abort_transition_graph:  Transition aborted: Action lost | 
source=action_timer_callback:911 complete=false
May 10 15:26:33 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:  warning: 
cib_action_update:       rsc_op 7: fence_vmhost1_ipmi_start_0 on 
vmhost0-fsl.bcn timed out
May 10 15:26:33 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
create_operation_update: cib_action_update: Updating resource 
fence_vmhost1_ipmi after start op Timed Out (interval=0)
May 10 15:26:33 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:   notice: 
run_graph:       Transition 1 (Complete=4, Pending=0, Fired=0, Skipped=0, 
Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-24.bz2): Complete
May 10 15:26:33 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
do_state_transition:     State transition S_TRANSITION_ENGINE -> 
S_POLICY_ENGINE | input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd
May 10 15:26:33 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Forwarding cib_modify operation for section status to 
all (origin=local/crmd/51)
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
May 10 15:27:04 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:    error: 
cib_action_updated:      Update 51 FAILED: Timer expired
May 10 15:27:04 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:   notice: 
unpack_config:   On loss of CCM Quorum: Ignore
May 10 15:27:04 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
determine_online_status_fencing: Node vmhost0-fsl.bcn is active
May 10 15:27:04 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
determine_online_status: Node vmhost0-fsl.bcn is online
May 10 15:27:04 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
unpack_node_loop:        Node 1 is already processed
May 10 15:27:04 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
unpack_node_loop:        Node 2 is already processed
May 10 15:27:04 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
unpack_node_loop:        Node 1 is already processed
May 10 15:27:04 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
unpack_node_loop:        Node 2 is already processed
May 10 15:27:04 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
common_print:    fence_vmhost0_ipmi      (stonith:fence_ipmilan):        
Started vmhost0-fsl.bcn
May 10 15:27:04 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
common_print:    fence_vmhost1_ipmi      (stonith:fence_ipmilan):        
Started vmhost0-fsl.bcn
May 10 15:27:04 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
RecurringOp:      Start recurring monitor (60s) for fence_vmhost1_ipmi on 
vmhost0-fsl.bcn
May 10 15:27:04 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
LogActions:      Leave   fence_vmhost0_ipmi      (Started vmhost0-fsl.bcn)
May 10 15:27:04 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:     info: 
LogActions:      Leave   fence_vmhost1_ipmi      (Started vmhost0-fsl.bcn)
May 10 15:27:04 [11143] vmhost0-fsl.jsc.nasa.gov    pengine:   notice: 
process_pe_message:      Calculated transition 2, saving inputs in 
/var/lib/pacemaker/pengine/pe-input-25.bz2
May 10 15:27:04 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
do_state_transition:     State transition S_POLICY_ENGINE -> 
S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE 
origin=handle_response
May 10 15:27:04 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
do_te_invoke:    Processing graph 2 (ref=pe_calc-dc-1525984024-27) derived from 
/var/lib/pacemaker/pengine/pe-input-25.bz2
May 10 15:27:04 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:   notice: 
te_rsc_command:  Initiating start operation fence_vmhost1_ipmi_start_0 locally 
on vmhost0-fsl.bcn | action 5
May 10 15:27:04 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:     info: 
do_lrm_rsc_op:   Performing key=5:2:0:15826900-16df-4cd4-9f7d-7213b59d15a6 
op=fence_vmhost1_ipmi_start_0
May 10 15:27:04 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Forwarding cib_modify operation for section status to 
all (origin=local/crmd/83)
May 10 15:27:04 [11141] vmhost0-fsl.jsc.nasa.gov       lrmd:     info: 
log_execute:     executing - rsc:fence_vmhost1_ipmi action:start call_id:13
May 10 15:27:04 [11141] vmhost0-fsl.jsc.nasa.gov       lrmd:     info: 
log_finished:    finished - rsc:fence_vmhost1_ipmi action:start call_id:13  
exit-code:0 exec-time:112ms queue-time:0ms
May 10 15:27:04 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:   notice: 
process_lrm_event:       Result of start operation for fence_vmhost1_ipmi on 
vmhost0-fsl.bcn: 0 (ok) | call=13 key=fence_vmhost1_ipmi_start_0 confirmed=true 
cib-update=84
May 10 15:27:04 [11139] vmhost0-fsl.jsc.nasa.gov        cib:     info: 
cib_process_request:     Forwarding cib_modify operation for section status to 
all (origin=local/crmd/84)
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
May 10 15:27:34 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:  warning: 
cib_rsc_callback:        Resource update 83 failed: (rc=-62) Timer expired
May 10 15:27:34 [11144] vmhost0-fsl.jsc.nasa.gov       crmd:  warning: 
cib_rsc_callback:        Resource update 84 failed: (rc=-62) Timer expired
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.
[11130] vmhost0-fsl.jsc.nasa.gov corosyncwarning [MAIN  ] Totem is unable to 
form a cluster because of an operating system or network fault. The most common 
cause of this message is that the local firewall is configured improperly.

_______________________________________________
Users mailing list: [email protected]
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
