blank response for thread to appear in mailbox..pls ignore

On Tue, May 23, 2017 at 4:21 AM, Ken Gaillot <[email protected]> wrote:
> On 05/16/2017 04:34 AM, Anu Pillai wrote:
> > Hi,
> >
> > Please find attached debug logs for the stated problem as well as
> > crm_mon command outputs.
> > In this case we are trying to remove/delete res3 and system/node
> > (0005B94238BC) from the cluster.
> >
> > *_Test reproduction steps_*
> >
> > Current configuration of the cluster:
> > 0005B9423910 - res2
> > 0005B9427C5A - res1
> > 0005B94238BC - res3
> >
> > *crm_mon output:*
> >
> > Defaulting to one-shot mode
> > You need to have curses available at compile time to enable console mode
> > Stack: corosync
> > Current DC: 0005B9423910 (version 1.1.14-5a6cdd1) - partition with quorum
> > Last updated: Tue May 16 12:21:23 2017
> > Last change: Tue May 16 12:13:40 2017 by root via crm_attribute on 0005B9423910
> >
> > 3 nodes and 3 resources configured
> >
> > Online: [ 0005B94238BC 0005B9423910 0005B9427C5A ]
> >
> > res2 (ocf::redundancy:RedundancyRA): Started 0005B9423910
> > res1 (ocf::redundancy:RedundancyRA): Started 0005B9427C5A
> > res3 (ocf::redundancy:RedundancyRA): Started 0005B94238BC
> >
> > Trigger the delete operation for res3 and node 0005B94238BC.
> >
> > The following commands were applied from node 0005B94238BC:
> > $ pcs resource delete res3 --force
> > $ crm_resource -C res3
> > $ pcs cluster stop --force
>
> I don't think "pcs resource delete" or "pcs cluster stop" does anything
> with the --force option. In any case, --force shouldn't be needed here.
>
> The crm_mon output you see is actually not what it appears. It starts with:
>
> May 16 12:21:27 [4661] 0005B9423910 crmd: notice: do_lrm_invoke:
> Forcing the status of all resources to be redetected
>
> This is usually the result of a "cleanup all" command. It works by
> erasing the resource history, causing pacemaker to re-probe all nodes to
> get the current state. The history erasure makes it appear to crm_mon
> that the resources are stopped, but they actually are not.
>
> In this case, I'm not sure why it's doing a "cleanup all", since you
> only asked it to clean up res3. Maybe in this particular instance, you
> actually did "crm_resource -C"?
>
> > The following command was applied from the DC (0005B9423910):
> > $ crm_node -R 0005B94238BC --force
>
> This can cause problems. This command shouldn't be run unless the node
> is removed from both pacemaker's and corosync's configuration. If you
> actually are trying to remove the node completely, a better alternative
> would be "pcs cluster node remove 0005B94238BC", which will handle all
> of that for you. If you're not trying to remove the node completely,
> then you shouldn't need this command at all.
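A minimal sketch of the sequence suggested above, assuming the pcs 0.9.x
syntax used elsewhere in this thread (exact command options can differ
between versions):

    # Clean up only res3's history; a bare "crm_resource -C" (without -r)
    # erases every resource's history and forces a full re-probe.
    crm_resource -C -r res3

    # Remove the resource definition; --force should not be needed.
    pcs resource delete res3

    # Remove the node from both corosync and pacemaker configuration in
    # one step, instead of "pcs cluster stop" plus "crm_node -R --force".
    # Run this from one of the nodes that is staying in the cluster.
    pcs cluster node remove 0005B94238BC
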
> > *crm_mon output:*
> >
> > Defaulting to one-shot mode
> > You need to have curses available at compile time to enable console mode
> > Stack: corosync
> > Current DC: 0005B9423910 (version 1.1.14-5a6cdd1) - partition with quorum
> > Last updated: Tue May 16 12:21:27 2017
> > Last change: Tue May 16 12:21:26 2017 by root via cibadmin on 0005B94238BC
> >
> > 3 nodes and 2 resources configured
> >
> > Online: [ 0005B94238BC 0005B9423910 0005B9427C5A ]
> >
> > The observation is that the remaining two resources, res2 and res1,
> > were stopped and started.
> >
> > Regards,
> > Aswathi
> >
> > On Mon, May 15, 2017 at 8:11 PM, Ken Gaillot <[email protected]> wrote:
> >
> > On 05/15/2017 06:59 AM, Klaus Wenninger wrote:
> > > On 05/15/2017 12:25 PM, Anu Pillai wrote:
> > >> Hi Klaus,
> > >>
> > >> Please find attached cib.xml as well as corosync.conf.
> >
> > Maybe you're only setting this while testing, but having
> > stonith-enabled=false and no-quorum-policy=ignore is highly dangerous in
> > any kind of network split.
> >
> > FYI, default-action-timeout is deprecated in favor of setting a timeout
> > in op_defaults, but it doesn't hurt anything.
> >
> > > Why wouldn't you keep placement-strategy with default
> > > to keep things simple. You aren't using any load-balancing
> > > anyway as far as I understood it.
> >
> > It looks like the intent is to use placement-strategy to limit each node
> > to 1 resource. The configuration looks good for that.
> >
> > > Haven't used resource-stickiness=INF. No idea which strange
> > > behavior that triggers. Try to have it just higher than what
> > > the other scores might sum up to.
> >
> > Either way would be fine. Using INFINITY ensures that no other
> > combination of scores will override it.
> >
> > > I might have overlooked something in your scores but otherwise
> > > there is nothing obvious to me.
> > >
> > > Regards,
> > > Klaus
> >
> > I don't see anything obvious either. If you have logs around the time of
> > the incident, that might help.
> >
> > >> Regards,
> > >> Aswathi
> > >>
> > >> On Mon, May 15, 2017 at 2:46 PM, Klaus Wenninger <[email protected]> wrote:
> > >>
> > >> On 05/15/2017 09:36 AM, Anu Pillai wrote:
> > >> > Hi,
> > >> >
> > >> > We are running a pacemaker cluster for managing our resources. We
> > >> > have 6 systems running 5 resources, and one is acting as standby.
> > >> > We have a restriction that only one resource can run on one node.
> > >> > But our observation is that whenever we add or delete a resource
> > >> > from the cluster, all the remaining resources in the cluster are
> > >> > stopped and started back.
> > >> >
> > >> > Can you please guide us whether this is normal behavior or whether
> > >> > we are missing any configuration that is leading to this issue.
> > >>
> > >> It should definitely be possible to prevent this behavior.
> > >> If you share your config with us we might be able to
> > >> track that down.
> > >>
> > >> Regards,
> > >> Klaus
> > >>
> > >> >
> > >> > Regards
> > >> > Aswathi
>
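On the configuration points discussed in the quoted thread (one resource
per node via placement-strategy, resource-stickiness, and the deprecated
default-action-timeout), a minimal sketch of how such settings could look
with pcs of that era. The attribute name "capacity" and the example values
are assumptions for illustration only, not taken from the attached cib.xml:

    # Utilization-based placement: give every node a capacity of 1 and
    # every resource a requirement of 1, so no node runs more than one.
    pcs property set placement-strategy=utilization
    pcs node utilization 0005B9423910 capacity=1
    pcs resource utilization res2 capacity=1

    # Keep resources where they are once placed.
    pcs resource defaults resource-stickiness=INFINITY

    # default-action-timeout is deprecated; set an op default instead.
    pcs resource op defaults timeout=120s
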
_______________________________________________
Users mailing list: [email protected]
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
