On 04/03/2018 06:07 AM, 范国腾 wrote:
> Yes, my resources are started and they are in slave status, so I ran the 
> "pcs resource cleanup pgsql-ha" command. The log shows the error when I run 
> this command.
>
> -----Original Message-----
> From: Users [mailto:[email protected]] On Behalf Of Andrei Borzenkov
> Sent: April 3, 2018 12:00
> To: [email protected]
> Subject: Re: [ClusterLabs] How to setup a simple master/slave cluster in two nodes 
> without stonith resource
>
> 03.04.2018 05:07, 范国腾 wrote:
>> Hello,
>>
>> I want to set up a cluster on two nodes, one as master and the other as 
>> slave. I don't need a fencing device because my internal network is 
>> stable. I use the following commands to create the resources, but both 
>> nodes stay in slave status and the cluster doesn't promote either of them 
>> to master. Could you please help check whether there is anything wrong 
>> with my configuration?
What is the reason you are using a cluster? Someone ripping out a
network cable isn't the only reason why one node might not see the other
node. In addition, some kind of fencing might be useful if a node isn't
able to get its resources under control.
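As a sketch of the fencing suggestion above, assuming IPMI-capable nodes: the
fence agent choice, IP addresses, and credentials below are hypothetical
placeholders, and fence_ipmilan parameter names vary between fence-agents
versions, so check `pcs stonith describe fence_ipmilan` on your system first.

```shell
# Sketch only: one fence device per node, with placeholder IPMI details.
pcs stonith create fence_node1 fence_ipmilan \
    pcmk_host_list="node1-1" ipaddr="192.168.1.101" \
    login="admin" passwd="secret" \
    op monitor interval=60s
pcs stonith create fence_node2 fence_ipmilan \
    pcmk_host_list="node2-1" ipaddr="192.168.1.102" \
    login="admin" passwd="secret" \
    op monitor interval=60s
# Re-enable fencing once the devices are defined and tested
pcs property set stonith-enabled=true
```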

>>
>> pcs property set stonith-enabled=false
>> pcs resource create pgsqld ocf:heartbeat:pgsqlms \
>>     bindir=/usr/local/pgsql/bin pgdata=/home/postgres/data \
>>     op start timeout=600s op stop timeout=60s \
>>     op promote timeout=300s op demote timeout=120s \
>>     op monitor interval=15s timeout=100s role="Master" \
>>     op monitor interval=16s timeout=100s role="Slave" \
>>     op notify timeout=60s
>> pcs resource master pgsql-ha pgsqld notify=true interleave=true
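One way to investigate why neither clone instance gets promoted is to look at
the master scores the resource agent sets. A sketch, assuming a running
cluster; the attribute name `master-pgsqld` follows the usual
`master-<resource>` convention for promotable clones and is an assumption
here:

```shell
# Show the scores the policy engine computed against the live CIB
crm_simulate -sL | grep -i master

# Query the transient (reboot-lifetime) master score on each node,
# as set by the pgsqlms agent during monitor
crm_attribute -N node1-1 -n master-pgsqld -l reboot --query
crm_attribute -N node2-1 -n master-pgsqld -l reboot --query
```

If no master score is set on either node, the cluster has nothing to base a
promotion decision on, which would match the symptom of both instances
staying in slave state.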
>>
>> The status is as below:
>>
>> [root@node1 ~]# pcs status
>> Cluster name: cluster_pgsql
>> Stack: corosync
>> Current DC: node2-1 (version 1.1.15-11.el7-e174ec8) - partition with quorum
>> Last updated: Mon Apr  2 21:51:57 2018
>> Last change: Mon Apr  2 21:32:22 2018 by hacluster via crmd on node2-1
>>
>> 2 nodes and 3 resources configured
>>
>> Online: [ node1-1 node2-1 ]
>>
>> Full list of resources:
>>
>> Master/Slave Set: pgsql-ha [pgsqld]
>>      Slaves: [ node1-1 node2-1 ]
>> pgsql-master-ip        (ocf::heartbeat:IPaddr2):       Stopped
>>
>> Daemon Status:
>>   corosync: active/disabled
>>   pacemaker: active/disabled
>>   pcsd: active/enabled
>>
>> When I execute pcs resource cleanup on one node, one of the nodes always 
>> prints the following warning messages in /var/log/messages, but the other 
>> node's log shows no error. The resource agent log (pgsqlms) shows the 
>> monitor action returned 0, so why does the crmd log report a failure?
>>
>> Apr  2 21:53:09 node2 crmd[2425]: warning: No reason to expect node 1 to be down
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph
>> Apr  2 21:53:09 node2 crmd[2425]: warning: No reason to expect node 2 to be down
>> Apr  2 21:53:09 node2 pengine[2424]:  notice: Start   pgsqld:0#011(node1-1)
>> Apr  2 21:53:09 node2 pengine[2424]:  notice: Start   pgsqld:1#011(node2-1)
>> Apr  2 21:53:09 node2 pengine[2424]:  notice: Calculated transition 4, saving inputs in /var/lib/pacemaker/pengine/pe-input-6.bz2
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: Initiating monitor operation pgsqld:0_monitor_0 on node1-1 | action 2
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: Initiating monitor operation pgsqld:1_monitor_0 locally on node2-1 | action 3
>> Apr  2 21:53:09 node2 pgsqlms(pgsqld)[3644]: INFO: Action is monitor
>> Apr  2 21:53:09 node2 pgsqlms(pgsqld)[3644]: INFO: pgsql_monitor: monitor is a probe
>> Apr  2 21:53:09 node2 pgsqlms(pgsqld)[3644]: INFO: pgsql_monitor: instance "pgsqld" is listening
>> Apr  2 21:53:09 node2 pgsqlms(pgsqld)[3644]: INFO: Action result is 0
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: Result of probe operation for pgsqld on node2-1: 0 (ok) | call=33 key=pgsqld_monitor_0 confirmed=true cib-update=62
>> Apr  2 21:53:09 node2 crmd[2425]: warning: Action 3 (pgsqld:1_monitor_0) on node2-1 failed (target: 7 vs. rc: 0): Error
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: Transition aborted by operation pgsqld_monitor_0 'create' on node2-1: Event failed | magic=0:0;3:4:7:3a132f28-d8b9-4948-bb6b-736edc221664 cib=0.28.2 source=match_graph_event:310 complete=false
>> Apr  2 21:53:09 node2 crmd[2425]: warning: Action 3 (pgsqld:1_monitor_0) on node2-1 failed (target: 7 vs. rc: 0): Error
>> Apr  2 21:53:09 node2 crmd[2425]: warning: Action 2 (pgsqld:0_monitor_0) on node1-1 failed (target: 7 vs. rc: 0): Error
>> Apr  2 21:53:09 node2 crmd[2425]: warning: Action 2 (pgsqld:0_monitor_0) on node1-1 failed (target: 7 vs. rc: 0): Error
> Apparently your applications are already started on both nodes at the time 
> you start pacemaker. Pacemaker expects resources to be in inactive state 
> initially.

Not necessarily, I would say. Isn't that why they are probed on startup?
Though this probing somehow seems to fail here.

Regards,
Klaus
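For context on the "target: 7 vs. rc: 0" warnings in the quoted logs, a
minimal sketch of the two standard OCF exit codes involved:

```shell
#!/bin/sh
# Standard OCF resource agent exit codes behind the warning
# "failed (target: 7 vs. rc: 0)" in the logs above.
OCF_SUCCESS=0        # action succeeded / resource is running
OCF_NOT_RUNNING=7    # resource is cleanly stopped
# After a cleanup, the crmd re-probes the resource on each node. On a
# node where it believes the resource is stopped, it expects the probe
# to return OCF_NOT_RUNNING (7); the agent instead reports OCF_SUCCESS
# (0) because PostgreSQL is already running, so the probe result is
# logged as a failure even though the agent itself succeeded.
echo "probe expected rc=$OCF_NOT_RUNNING, got rc=$OCF_SUCCESS"
```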

>
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: Transition 4 (Complete=4, Pending=0, Fired=0, Skipped=0, Incomplete=10, Source=/var/lib/pacemaker/pengine/pe-input-6.bz2): Complete
>> Apr  2 21:53:09 node2 pengine[2424]:  notice: Calculated transition 5, saving inputs in /var/lib/pacemaker/pengine/pe-input-7.bz2
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: Initiating monitor operation pgsqld_monitor_16000 locally on node2-1 | action 4
>> Apr  2 21:53:09 node2 crmd[2425]:  notice: Initiating monitor operation pgsqld_monitor_16000 on node1-1 | action 7
>> Apr  2 21:53:09 node2 pgsqlms(pgsqld)[3663]: INFO: Action is monitor
>>
>>
>>
>> _______________________________________________
>> Users mailing list: [email protected] 
>> https://lists.clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org Getting started: 
>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
