On Wed, Oct 29, 2008 at 12:51:44PM -0400, Aaron Bush wrote:
> I have a 0.6 pacemaker/heartbeat cluster set up in a lab with resources
> as follows:
>
> Group-lvs(ordered): two primitives -> ocf/IPaddr2 and ocf/ldirectord.
> Clone-pingd: set to monitor a couple of IPs and used to set a weight for
> where to run the LVS group.
>
> -- This is the area that I have a question on --
> Clone-stonith-node1: HP ILO to shoot node1
> Clone-stonith-node2: HP ILO to shoot node2
>
> I read on the old linux-ha site that using a clone for ILO/stonith was
> the way to go. I'm not sure I see how this would work correctly and be
> preferred over a standard resource. What I am confused about is this:
> the external/riloe stonith plugin only knows how to shoot one node so

Please make sure that you use the latest version of external/riloe.
The previous one didn't work under all circumstances.

Thanks,

Dejan

> why would you want to run it as a clone since each external/riloe is
> configured differently. I went ahead and configured the riloe resources
> as clones feeling that the docs are correct and that the reason would
> become obvious to me later. (I also saw a similar post with no
> response:
> http://www.gossamer-threads.com/lists/linuxha/users/35685?nohighlight=1#35685)
>
> I then noticed that my ILO clones were starting on the 'wrong' nodes.
> As in the stonith resource to kill node 2 was actually running on node
> 2, which is pointless if node 2 locks up. So I added resource
> constraints to force the stonith clone to stay on a node that was not
> the one to be shot. This seemed to work well.
>
> The next issue I have is that when I disconnect the LAN cable on a
> single node that connects it to the rest of the network, the clone
> stonith monitor will fail since it can't connect to the other node's ILO
> for status. After some time (minutes, let's say) I reconnect the LAN
> cable but never see the clone stonith come back to life; it just stays
> failed. What should I be looking at to make sure that the clone stonith
> restarts properly?
>
> Any advice on how to more properly set up an HP ILO stonith in this
> scenario would be greatly appreciated. (I can see where a clone stonith
> would be useful in a large cluster of n>2 nodes since all nodes could
> have a chance to shoot a failed node and maybe this is the reason for
> cloned stonith with ILO? Basically in a cluster of N nodes each node
> would be running N-1 stonith resources, ready to shoot a dead node.)
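
The N-1 placement described in the quoted paragraph can be sketched in a few lines of Python. This is only a model of the idea, not Pacemaker or external/riloe code: the function name and the "stonith-<node>" resource names are made up for illustration, and it assumes one stonith resource per node to be shot.

```python
def allowed_stonith_placement(nodes):
    """Map each node to the stonith resources it may run.

    Assumption (hypothetical naming): the resource that shoots node X is
    called "stonith-X". A node never hosts the resource that targets
    itself, so every node ends up with N-1 candidates.
    """
    placement = {}
    for host in nodes:
        placement[host] = [
            "stonith-" + target for target in nodes if target != host
        ]
    return placement


if __name__ == "__main__":
    # Three-node example: each node may run the two resources
    # that shoot the other two nodes.
    for host, resources in allowed_stonith_placement(
        ["node1", "node2", "node3"]
    ).items():
        print(host, "may run", resources)
```

In the two-node case this reduces to what the location constraints mentioned above enforce: the resource that shoots node2 may only run on node1, and the other way round.
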
>
> Thanks in advance,
> -ab
>
> _______________________________________________
> Pacemaker mailing list
> [email protected]
> http://list.clusterlabs.org/mailman/listinfo/pacemaker
