I have a Pacemaker 0.6/Heartbeat cluster set up in a lab, with resources as follows:

Group-lvs (ordered): two primitives, ocf/IPaddr2 and ocf/ldirectord.
Clone-pingd: set to monitor a couple of IPs and used to set a weight for where to run the LVS group.

-- This is the area that I have a question on --

Clone-stonith-node1: HP iLO to shoot node1
Clone-stonith-node2: HP iLO to shoot node2

I read on the old linux-ha site that using a clone for iLO/stonith was the way to go. I'm not sure I see how this would work correctly and be preferred over a standard resource. What I am confused about is this: the external/riloe stonith plugin only knows how to shoot one node, so why would you want to run it as a clone when each external/riloe instance is configured differently?

I went ahead and configured the riloe resources as clones, trusting that the docs are correct and that the reason would become obvious to me later. (I also saw a similar post with no response: http://www.gossamer-threads.com/lists/linuxha/users/35685?nohighlight=1#35685)

I then noticed that my iLO clones were starting on the 'wrong' nodes: the stonith resource to kill node 2 was actually running on node 2, which is pointless if node 2 locks up. So I added resource constraints to force each stonith clone to stay on a node other than the one to be shot. This seemed to work well.

The next issue I have is that when I disconnect the LAN cable that connects a single node to the rest of the network, the clone stonith monitor fails, since it can't reach the other node's iLO for status. After some time (a few minutes, say) I reconnect the LAN cable, but the clone stonith never comes back to life; it just stays failed. What should I be looking at to make sure that the clone stonith restarts properly?

Any advice on how to more properly set up an HP iLO stonith in this scenario would be greatly appreciated.

(I can see where a clone stonith would be useful in a larger cluster of n>2 nodes, since all nodes would have a chance to shoot a failed node. Maybe this is the reason for cloned stonith with iLO? Basically, in a cluster of N nodes each node would be running N-1 stonith resources, ready to shoot a dead node.)

Thanks in advance,
-ab
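
To make the N-1 idea above concrete, here is a minimal Python sketch of the placement being described. It is only an illustration of the counting argument, not Pacemaker configuration: the function name stonith_placement and the stonith-<node> resource names are hypothetical, and it assumes one stonith resource per target node that may run anywhere except on its own target.

```python
def stonith_placement(nodes):
    """Map each host to the stonith resources it may run.

    Assumption: there is one stonith resource per target node, and a
    resource must never run on the node it is meant to shoot.
    """
    return {
        host: [f"stonith-{target}" for target in nodes if target != host]
        for host in nodes
    }


# Hypothetical three-node cluster: each host carries N-1 = 2 resources.
placement = stonith_placement(["node1", "node2", "node3"])
for host, resources in placement.items():
    print(host, "->", resources)
```

In the two-node case this reduces to the constraint described earlier: node1 only ever hosts the resource that shoots node2, and vice versa.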
