Based on your config, the only reason I can find that the slave doesn't start is that the second node is offline.
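One other thing worth checking in the configuration quoted below: `colocation link-resources inf: ZK UFO BOUM APACHE MS_POSTGRESQL` colocates the applications with the ms resource as a whole rather than with its Master role, which in some Pacemaker versions can pin both instances and leave the slave with nowhere to run. If the intent is to keep the applications with the PostgreSQL master while still letting the slave start on the other node, a role-scoped constraint is the usual approach. A sketch only (resource names are taken from the thread; the constraint names are illustrative and this has not been tested against the 1.1.7 configuration in question):

```
# Colocate the applications with the Master role only, so the slave
# instance remains free to start on the other node.
colocation apps-with-pg-master inf: APACHE BOUM MS_POSTGRESQL:Master
# Optionally, start the applications only once a master has been promoted.
order apps-after-promote inf: MS_POSTGRESQL:promote APACHE:start
```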
On 15 May 2014, at 9:34 am, Andrew Beekhof <[email protected]> wrote:
>
> On 14 May 2014, at 8:00 pm, Sékine Coulibaly <[email protected]> wrote:
>
>> Hi Andrew,
>>
>> I came through some kind of solution, slightly different from what I used in
>> my first post. You'll find it in the raw cibadmin output attached to this post.
>> BOUM, UFO, INGESTOR and QUOTAS are all applications and depend on ZK and
>> Postgresql.
>>
>> I'm somewhat stuck with the 6.3 release for the moment. Right now I'm
>> considering switching to pcs to make the transition to RHEL 6.5 or 7.x
>> easier. Transition to RHEL 6.x is a global decision process, so I'm afraid
>> I'll stay with 6.3 for the moment.
>
> Ok, but bear in mind that if this turns out to be a bug, I cannot fix it for
> 6.3. Your only way to get the fix is to upgrade or build upstream yourself.
>
>>
>> Sekine
>>
>>> Hi,
>>>
>>> Let me explain my use case. I'm using RHEL 6.3
>>
>> fwiw, there are updates to pacemaker 1.1.10 in 6.4 and 6.5.
>> It's even supported now.
>>
>>> with Corosync + Pacemaker + PostgreSQL 9.2 + repmgr 2.0. I have two nodes
>>> named clustera and clusterb.
>>>
>>> I have a total of 3 resources:
>>> - APACHE
>>> - BOUM
>>> - MS_POSTGRESQL
>>>
>>> They are defined as follows:
>>>
>>> sudo crm configure primitive APACHE ocf:heartbeat:apache \
>>>   params configfile=/etc/httpd/conf/httpd.conf \
>>>   op monitor interval=5s timeout=10s \
>>>   op start interval=0 timeout=10s \
>>>   op stop interval=0 timeout=10s
>>>
>>> sudo crm configure primitive BOUM ocf:heartbeat:anything \
>>>   params binfile=/usr/local/boum/current/bin/boum \
>>>     workdir=/var/boum \
>>>     logfile=/var/log/boum/boum_STDOUT \
>>>     errlogfile=/var/log/boum/boum_STDERR \
>>>     pidfile=/var/run/boum.pid \
>>>   op monitor interval=5s timeout=10s \
>>>   op start interval=0 timeout=10s \
>>>   op stop interval=0 timeout=10s
>>>
>>> sudo crm configure primitive POSTGRESQL ocf:xxxxxx:postgresql \
>>>   params repmgr_conf=/var/lib/pgsql/repmgr/repmgr.conf \
>>>     pgctl=/usr/pgsql-9.2/bin/pg_ctl pgdata=/opt/pgdata \
>>>   op start interval=0 timeout=90s \
>>>   op stop interval=0 timeout=60s \
>>>   op promote interval=0 timeout=120s \
>>>   op monitor interval=53s role=Master \
>>>   op monitor interval=60s role=Slave
>>>
>>> Since PostgreSQL is in streaming replication, I need to have a master
>>> and a slave constantly running. Hence, I created a master/slave resource
>>> called MS_POSTGRESQL.
>>>
>>> I want APACHE, BOUM and the PostgreSQL master to run
>>> together on the same node. It looks like as soon as I add a
>>> colocation, the Postgresql slave doesn't start anymore.
>>>
>>> I end up with:
>>>
>>> Online: [ clusterb clustera ]
>>>
>>>  Master/Slave Set: MS_POSTGRESQL [POSTGRESQL]
>>>      Masters: [ clustera ]
>>>      Stopped: [ POSTGRESQL:1 ]
>>>  APACHE  (ocf::heartbeat:apache):        Started clustera
>>>  BOUM    (ocf::heartbeat:anything):      Started clustera
>>>
>>> My configuration is as follows:
>>>
>>> node clustera \
>>>   attributes standby="off"
>>> node clusterb \
>>>   attributes standby="off"
>>> primitive APACHE ocf:heartbeat:apache \
>>>   params configfile="/etc/httpd/conf/httpd.conf" \
>>>   op monitor interval="5s" timeout="10s" \
>>>   op start interval="0" timeout="10s" \
>>>   op stop interval="0" timeout="10s" \
>>>   meta target-role="Started"
>>> primitive BOUM ocf:heartbeat:anything \
>>>   params binfile="/usr/local/boum/current/bin/boum" \
>>>     workdir="/var/boum" logfile="/var/log/boum/boum_STDOUT" \
>>>     errlogfile="/var/log/boum/boum_STDERR" pidfile="/var/run/boum.pid" \
>>>   op monitor interval="5s" timeout="10s" \
>>>   op start interval="0" timeout="10s" \
>>>   op stop interval="0" timeout="10s"
>>> primitive POSTGRESQL ocf:xxxxxxx:postgresql \
>>>   params repmgr_conf="/var/lib/pgsql/repmgr/repmgr.conf" \
>>>     pgctl="/usr/pgsql-9.2/bin/pg_ctl" pgdata="/opt/pgdata" \
>>>   op start interval="0" timeout="90s" \
>>>   op stop interval="0" timeout="60s" \
>>>   op promote interval="0" timeout="120s" \
>>>   op monitor interval="53s" role="Master" \
>>>   op monitor interval="60s" role="Slave"
>>> ms MS_POSTGRESQL POSTGRESQL \
>>>   meta clone-max="2" target-role="Started" resource-stickiness="100" \
>>>     notify="true"
>>> colocation link-resources inf: ZK UFO BOUM APACHE MS_POSTGRESQL
>>
>> Could you send the raw xml (cibadmin -Ql) please?
>> I've never gotten used to crmsh's colocation syntax and don't have it
>> installed locally (pcs is the supplied tool for configuring pacemaker on
>> rhel)
>>
>>> property $id="cib-bootstrap-options" \
>>>   dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
>>>   cluster-infrastructure="openais" \
>>>   expected-quorum-votes="2" \
>>>   stonith-enabled="false" \
>>>   no-quorum-policy="ignore" \
>>>   default-resource-stickiness="10" \
>>>   start-failure-is-fatal="false" \
>>>   last-lrm-refresh="1398775386"
>>>
>>> Is this normal behaviour? If it is, is there a workaround I didn't think
>>> of?
>> <cibadmin.txt>
>> _______________________________________________
>> Pacemaker mailing list: [email protected]
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>
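For reference, the raw XML dump requested above can be produced on a cluster node with the command Andrew names, redirected to a file for attaching to the list (a sketch; assumes the pacemaker CLI tools are installed, and the output filename is illustrative):

```
# Query (-Q) the live CIB from the local node (-l) and save it as raw XML.
cibadmin -Q -l > cib-dump.xml
```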
