26.09.2020 12:22, Michael Ivanov wrote:
> Hallo,
>
> I have strange problem: when I reset the node on which my resources are running,
> they are correctly migrated to the other node. But when I turn the failed node
> back, then as soon as it is up all resources are returned back to it. I have set
> resource-stickiness default value to 100. When this did not help I have set up
> resource-stickiness meta attr also to 100 for all my resources. Still when the
> failed node recovers the resources are migrated back to it! Where should I look
> to try to understand this situation?
>
The first things to check are the location and colocation constraints.

> Here's the configuration of my cluster:
>
> root@node1# pcs status
> Cluster name: gcluster
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: node1 (version 2.0.4-2deceaa3ae) - partition with quorum
>   * Last updated: Sat Sep 26 11:12:34 2020
>   * Last change: Sat Sep 26 10:39:16 2020 by root via cibadmin on node1
>   * 2 nodes configured
>   * 14 resource instances configured (1 DISABLED)
>
> Node List:
>   * Online: [ node1 node2 ]
>
> Full List of Resources:
>   * ilo5_node1 (stonith:fence_ilo5_ssh): Started node2
>   * ilo5_node2 (stonith:fence_ilo5_ssh): Started node1
>   * Resource Group: VirtIP:
>     * PrimaryIP (ocf::heartbeat:IPaddr2): Started node2
>     * PrimaryIP6 (ocf::heartbeat:IPv6addr): Started node2
>     * AliasIP (ocf::heartbeat:IPaddr2): Started node2
>   * BackupFS (ocf::redhat:netfs.sh): Started node2
>   * Clone Set: MailVolume-clone [MailVolume] (promotable):
>     * Masters: [ node2 ]
>     * Slaves: [ node1 ]
>   * MailFS (ocf::heartbeat:Filesystem): Started node2
>   * apache (ocf::heartbeat:apache): Started node2
>   * postfix (ocf::heartbeat:postfix): Started node2
>   * amavis (service:amavis): Started node2
>   * dovecot (service:dovecot): Started node2
>   * openvpn (service:openvpn): Stopped (disabled)
>
> And resources:
>
> root@node1# pcs resource config
>  Group: VirtIP
>   Meta Attrs: resource-stickiness=100
>   Resource: PrimaryIP (class=ocf provider=heartbeat type=IPaddr2)
>    Attributes: cidr_netmask=16 ip=xx.xx.xx.20 nic=br0
>    Meta Attrs: resource-stickiness=100
>    Operations: monitor interval=30s (PrimaryIP-monitor-interval-30s)
>                start interval=0s timeout=20s (PrimaryIP-start-interval-0s)
>                stop interval=0s timeout=20s (PrimaryIP-stop-interval-0s)
>   Resource: PrimaryIP6 (class=ocf provider=heartbeat type=IPv6addr)
>    Attributes: cidr_netmask=64 ipv6addr=xxxx:xxxx:xxxx:xxxx:0:0:0:20 nic=br0
>    Meta Attrs: resource-stickiness=100
>    Operations: monitor interval=30s (PrimaryIP6-monitor-interval-30s)
>                start interval=0s timeout=15s (PrimaryIP6-start-interval-0s)
>                stop interval=0s timeout=15s (PrimaryIP6-stop-interval-0s)
>   Resource: AliasIP (class=ocf provider=heartbeat type=IPaddr2)
>    Attributes: cidr_netmask=16 ip=xx.xx.yy.20 nic=br0
>    Meta Attrs: resource-stickiness=100
>    Operations: monitor interval=30s (AliasIP-monitor-interval-30s)
>                start interval=0s timeout=20s (AliasIP-start-interval-0s)
>                stop interval=0s timeout=20s (AliasIP-stop-interval-0s)
>  Resource: BackupFS (class=ocf provider=redhat type=netfs.sh)
>   Attributes: export=/Backup/Gateway fstype=nfs host=atlas mountpoint=/Backup options=noatime,async
>   Meta Attrs: resource-stickiness=100
>   Operations: monitor interval=1m timeout=10 (BackupFS-monitor-interval-1m)
>               monitor interval=5m timeout=30 OCF_CHECK_LEVEL=10 (BackupFS-monitor-interval-5m)
>               monitor interval=10m timeout=30 OCF_CHECK_LEVEL=20 (BackupFS-monitor-interval-10m)
>               start interval=0s timeout=900 (BackupFS-start-interval-0s)
>               stop interval=0s timeout=30 (BackupFS-stop-interval-0s)
>  Clone: MailVolume-clone
>   Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=1 promoted-node-max=1 resource-stickiness=100
>   Resource: MailVolume (class=ocf provider=linbit type=drbd)
>    Attributes: drbd_resource=mail
>    Meta Attrs: resource-stickiness=100
>    Operations: demote interval=0s timeout=90 (MailVolume-demote-interval-0s)
>                monitor interval=60s (MailVolume-monitor-interval-60s)
>                notify interval=0s timeout=90 (MailVolume-notify-interval-0s)
>                promote interval=0s timeout=90 (MailVolume-promote-interval-0s)
>                reload interval=0s timeout=30 (MailVolume-reload-interval-0s)
>                start interval=0s timeout=240 (MailVolume-start-interval-0s)
>                stop interval=0s timeout=100 (MailVolume-stop-interval-0s)
>  Resource: MailFS (class=ocf provider=heartbeat type=Filesystem)
>   Attributes: device=/dev/drbd0 directory=/var/mail fstype=btrfs
>   Meta Attrs: resource-stickiness=100
>   Operations: monitor interval=20s timeout=40s (MailFS-monitor-interval-20s)
>               start interval=0s timeout=60s (MailFS-start-interval-0s)
>               stop interval=0s timeout=60s (MailFS-stop-interval-0s)
>  Resource: apache (class=ocf provider=heartbeat type=apache)
>   Attributes: client=wget statusurl=https://localhost/server-status
>   Meta Attrs: resource-stickiness=100
>   Operations: monitor interval=1min (apache-monitor-interval-1min)
>               start interval=0s timeout=40s (apache-start-interval-0s)
>               stop interval=0s timeout=60s (apache-stop-interval-0s)
>  Resource: postfix (class=ocf provider=heartbeat type=postfix)
>   Meta Attrs: resource-stickiness=100
>   Operations: monitor interval=60s timeout=20s (postfix-monitor-interval-60s)
>               reload interval=0s timeout=20s (postfix-reload-interval-0s)
>               start interval=0s timeout=20s (postfix-start-interval-0s)
>               stop interval=0s timeout=20s (postfix-stop-interval-0s)
>  Resource: amavis (class=service type=amavis)
>   Meta Attrs: resource-stickiness=100
>   Operations: force-reload interval=0s timeout=15 (amavis-force-reload-interval-0s)
>               monitor interval=15 timeout=15 (amavis-monitor-interval-15)
>               restart interval=0s timeout=15 (amavis-restart-interval-0s)
>               start interval=0s timeout=15 (amavis-start-interval-0s)
>               stop interval=0s timeout=15 (amavis-stop-interval-0s)
>  Resource: dovecot (class=service type=dovecot)
>   Meta Attrs: resource-stickiness=100
>   Operations: force-reload interval=0s timeout=15 (dovecot-force-reload-interval-0s)
>               monitor interval=15 timeout=15 (dovecot-monitor-interval-15)
>               restart interval=0s timeout=15 (dovecot-restart-interval-0s)
>               start interval=0s timeout=15 (dovecot-start-interval-0s)
>               stop interval=0s timeout=15 (dovecot-stop-interval-0s)
>  Resource: openvpn (class=service type=openvpn)
>   Meta Attrs: resource-stickiness=100 target-role=Stopped
>   Operations: force-reload interval=0s timeout=15 (openvpn-force-reload-interval-0s)
>               monitor interval=15 timeout=15 (openvpn-monitor-interval-15)
>               restart interval=0s timeout=15 (openvpn-restart-interval-0s)
>               start interval=0s timeout=15 (openvpn-start-interval-0s)
>               stop interval=0s timeout=15 (openvpn-stop-interval-0s)
>
> drbd resource is configured as follows:
>
> root@node1# cat /etc/drbd.d/mail.res
> resource mail {
>     protocol B;
>     device /dev/drbd0;
>     disk /dev/sys/mail;
>     meta-disk internal;
>
>     net {
>         csums-alg sha1;
>         after-sb-0pri discard-zero-changes;
>         after-sb-1pri discard-secondary;
>         after-sb-2pri disconnect;
>         rr-conflict disconnect;
>     }
>
>     handlers {
>         fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>         after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>         split-brain "/usr/lib/drbd/notify-split-brain.sh [email protected]";
>     }
>
>     on node1 {
>         address 192.168.0.102:7789;
>     }
>     on node2 {
>         address 192.168.0.103:7789;
>     }
> }
>
> Best regards,
>
> --
>  \   /  |                            |
>  (OvO)  |  Mikhail Iwanow            |
>  (^^^)  |                            |
>   \^/   |  E-mail:[email protected]   |
>   ^ ^   |                            |
>
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
>
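
As a rough illustration of the constraint check suggested at the top of this reply, the sketch below collects the inputs that usually decide whether a resource moves back. It is only a sketch under assumptions: it assumes pcs 0.10-style syntax (newer pcs releases may want a different subcommand for listing constraints), that crm_simulate is installed alongside Pacemaker, and that it is run as root on a cluster node. The function name show_failback_inputs is made up for this example. The reason to look here is that a location constraint with a score above the stickiness of 100 outweighs it, and pcs resource move leaves behind exactly such a constraint (id cli-prefer-<resource>, score INFINITY).

```shell
#!/bin/sh
# Sketch only: print the settings that commonly override resource-stickiness.
# show_failback_inputs is a hypothetical helper name, not part of pcs.

show_failback_inputs() {
    # Refuse to run where the cluster tools are not installed.
    if ! command -v pcs >/dev/null 2>&1; then
        echo "pcs not found; run this on a cluster node" >&2
        return 127
    fi

    # Location, colocation and ordering constraints, with their IDs.
    # Look for location constraints preferring the recovered node,
    # especially ones whose id starts with "cli-prefer-".
    pcs constraint --full

    # Cluster-wide resource defaults, to confirm resource-stickiness is set.
    pcs resource defaults

    # Placement scores as the scheduler currently computes them,
    # taken from the live cluster.
    crm_simulate -sL
}
```

Calling show_failback_inputs on node1 or node2 prints all three reports in sequence. If a leftover cli-prefer- constraint shows up, pcs resource clear <resource> is the usual way to remove it.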
