Hi Ken,

Did you get a chance to go through the logs? Do you need any more details?
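
If a consolidated log bundle for the incident window would be easier to work with than the attached files, I believe crm_report can generate one (the timestamps and output path below are only examples):

$ crm_report -f "2017-05-16 12:00:00" -t "2017-05-16 12:30:00" /tmp/res3-delete-report
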
Regards,
Aswathi

On Tue, May 16, 2017 at 3:04 PM, Anu Pillai <[email protected]> wrote:
> Hi,
>
> Please find attached debug logs for the stated problem as well as the crm_mon
> command outputs.
> In this case we are trying to remove/delete res3 and the system/node
> (0005B94238BC) from the cluster.
>
> *Test reproduction steps*
>
> Current configuration of the cluster:
> 0005B9423910 - res2
> 0005B9427C5A - res1
> 0005B94238BC - res3
>
> *crm_mon output:*
>
> Defaulting to one-shot mode
> You need to have curses available at compile time to enable console mode
> Stack: corosync
> Current DC: 0005B9423910 (version 1.1.14-5a6cdd1) - partition with quorum
> Last updated: Tue May 16 12:21:23 2017
> Last change: Tue May 16 12:13:40 2017 by root via crm_attribute on 0005B9423910
>
> 3 nodes and 3 resources configured
>
> Online: [ 0005B94238BC 0005B9423910 0005B9427C5A ]
>
> res2 (ocf::redundancy:RedundancyRA): Started 0005B9423910
> res1 (ocf::redundancy:RedundancyRA): Started 0005B9427C5A
> res3 (ocf::redundancy:RedundancyRA): Started 0005B94238BC
>
>
> Trigger the delete operation for res3 and node 0005B94238BC.
>
> The following commands were applied from node 0005B94238BC:
> $ pcs resource delete res3 --force
> $ crm_resource -C res3
> $ pcs cluster stop --force
>
> The following command was applied from the DC (0005B9423910):
> $ crm_node -R 0005B94238BC --force
>
>
> *crm_mon output:*
>
> Defaulting to one-shot mode
> You need to have curses available at compile time to enable console mode
> Stack: corosync
> Current DC: 0005B9423910 (version 1.1.14-5a6cdd1) - partition with quorum
> Last updated: Tue May 16 12:21:27 2017
> Last change: Tue May 16 12:21:26 2017 by root via cibadmin on 0005B94238BC
>
> 3 nodes and 2 resources configured
>
> Online: [ 0005B94238BC 0005B9423910 0005B9427C5A ]
>
>
> The observation is that the remaining two resources, res2 and res1, were
> stopped and started again.
>
>
> Regards,
> Aswathi
>
> On Mon, May 15, 2017 at 8:11 PM, Ken Gaillot <[email protected]> wrote:
>
>> On 05/15/2017 06:59 AM, Klaus Wenninger wrote:
>> > On 05/15/2017 12:25 PM, Anu Pillai wrote:
>> >> Hi Klaus,
>> >>
>> >> Please find attached cib.xml as well as corosync.conf.
>>
>> Maybe you're only setting this while testing, but having
>> stonith-enabled=false and no-quorum-policy=ignore is highly dangerous in
>> any kind of network split.
>>
>> FYI, default-action-timeout is deprecated in favor of setting a timeout
>> in op_defaults, but it doesn't hurt anything.
>>
>> > Why wouldn't you keep placement-strategy at the default
>> > to keep things simple? You aren't using any load-balancing
>> > anyway, as far as I understood it.
>>
>> It looks like the intent is to use placement-strategy to limit each node
>> to 1 resource. The configuration looks good for that.
>>
>> > Haven't used resource-stickiness=INF. No idea which strange
>> > behavior that triggers. Try to have it just higher than what
>> > the other scores might sum up to.
>>
>> Either way would be fine. Using INFINITY ensures that no other
>> combination of scores will override it.
>>
>> > I might have overlooked something in your scores, but otherwise
>> > there is nothing obvious to me.
>> >
>> > Regards,
>> > Klaus
>>
>> I don't see anything obvious either. If you have logs around the time of
>> the incident, that might help.
>>
>> >> Regards,
>> >> Aswathi
>> >>
>> >> On Mon, May 15, 2017 at 2:46 PM, Klaus Wenninger <[email protected]
>> >> <mailto:[email protected]>> wrote:
>> >>
>> >> On 05/15/2017 09:36 AM, Anu Pillai wrote:
>> >> > Hi,
>> >> >
>> >> > We are running a pacemaker cluster for managing our resources. We have 6
>> >> > systems running 5 resources, and one is acting as standby. We have a
>> >> > restriction that only one resource can run on one node. But our
>> >> > observation is that whenever we add or delete a resource from the
>> >> > cluster, all the remaining resources in the cluster are stopped and
>> >> > started back.
>> >> >
>> >> > Can you please guide us on whether this is normal behavior, or whether
>> >> > we are missing any configuration that is leading to this issue?
>> >>
>> >> It should definitely be possible to prevent this behavior.
>> >> If you share your config with us we might be able to
>> >> track that down.
>> >>
>> >> Regards,
>> >> Klaus
>> >>
>> >> >
>> >> > Regards
>> >> > Aswathi
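
For what it's worth, if we adopt the suggestions above, I believe the changes would be applied with pcs roughly as follows (the timeout value is only an example, and the exact syntax may differ slightly between pcs versions):

$ pcs property set stonith-enabled=true
$ pcs property set no-quorum-policy=stop
$ pcs resource op defaults timeout=120s   # instead of the deprecated default-action-timeout
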
_______________________________________________
Users mailing list: [email protected]
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
