Re: [Pacemaker] pacemaker/stonith running "amok"


On Nov 5, 2008, at 6:33 PM, Raoul Bhatia [IPAX] wrote:

hi,

first off, please find the hb_report at [1].

what i did to my 2-node cluster (wc01, wc02):

wc02# crm_standby -l reboot -N wc01 -v true

i verified that wc01 was in standby and (at least i think) the resources
have been migrated off wc01.

wc01# apt-get -u dist-upgrade

upgraded apache2

wc01# sync;sync;reboot

rebooted wc01, as i thought "-l reboot" would make wc01 rejoin after the
reboot.

wc01 came up but was still considered in standby mode. all of a sudden,
the cluster continuously rebooted wc02 until i finally moved wc01
out of standby mode with:

wc01# crm_standby -v off -N wc01 -l reboot


can anyone please explain what i did wrong?


The logs don't go back far enough to say.

At 18:18:05 the PE is invoked, sees that wc02 has failed, and starts to shoot it - but there is no record of it leaving the ccm.

Then all the stonith commands fail - you might want to check the script.

But there is no record at all of wc01 rebooting or wc02's reaction when it returns.


_______________________________________________
Pacemaker mailing list
[email protected]
http://list.clusterlabs.org/mailman/listinfo/pacemaker
