Hi! It looks like fencing is not operational on your systems, and the cluster wants to fence a node. I suggest bringing the cluster to a clean state first and then repeating the test.
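In case it helps with checking the logs before and after the next test, here is a minimal sketch in Python. It is not part of pacemaker: the function name scan and both regular expressions are my own assumptions, derived only from the log lines quoted in this thread, so other versions or syslog formats may need different patterns. It counts the stonith-ng warnings about the 'action' parameter and lists operations that finished with a non-zero return code.

```python
import re

# Both patterns are assumptions based on the log excerpts quoted in this
# thread; they are not an official pacemaker log format definition.
FENCE_WARNING = re.compile(r"stonith-ng\S*\s+warning:.*has 'action' parameter")
OP_RESULT = re.compile(r"Result of (\w+) operation for (\S+) on (\S+): (\d+) \(")


def scan(lines):
    """Return (count of fencing-config warnings, list of failed operations).

    Each failed operation is a tuple (resource, operation, node, rc).
    """
    fence_warnings = 0
    failed = []
    for line in lines:
        if FENCE_WARNING.search(line):
            fence_warnings += 1
        match = OP_RESULT.search(line)
        if match and match.group(4) != "0":
            operation, resource, node, rc = match.groups()
            failed.append((resource, operation, node, int(rc)))
    return fence_warnings, failed


# Three lines copied from the node 2 log in the quoted message.
SAMPLE = [
    "2018-12-03T16:02:32.394065+01:00 ha-idg-2 stonith-ng[14255]: warning: "
    "fence_ha-idg-1 has 'action' parameter, which should never be specified "
    "in configuration",
    "2018-12-03T16:02:32.464750+01:00 ha-idg-2 crmd[14259]: notice: Result of "
    "stop operation for vm_sim on ha-idg-2: 0 (ok)",
    "2018-12-03T16:03:03.033909+01:00 ha-idg-2 crmd[14259]: notice: Result of "
    "migrate_to operation for vm_mausdb on ha-idg-2: 1 (unknown error)",
]

if __name__ == "__main__":
    warnings, failed = scan(SAMPLE)
    print("fencing configuration warnings:", warnings)
    for resource, operation, node, rc in failed:
        print(f"failed: {operation} of {resource} on {node} (rc={rc})")
```

To run it against a real log, pass an open file object to scan() instead of SAMPLE. If the warning count is still non-zero after you have cleaned up the fencing configuration, I would not trust the result of a repeated migration test.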
Regards,
Ulrich

>>> "Lentes, Bernd" <[email protected]> wrote on 03.12.2018 at 16:40 in message
<1271670904.2579771.1543851612288.javamail.zim...@helmholtz-muenchen.de>:
> Hi,
>
> I have a two-node cluster with several VirtualDomains as resources. Normally
> live migration is no problem, but on rare occasions it fails without leaving
> any reasonable message in the logs. I tried to migrate several VirtualDomains
> concurrently from ha-idg-2 to ha-idg-1. One VirtualDomain failed, the others
> succeeded.
>
> This has happened once before. I *think* that was also when I tried to
> migrate several VirtualDomains concurrently, but I'm not completely sure (it
> was some time ago).
>
> The error appears only rarely; migration usually succeeds without any
> problems. I *think* it's related to migrating several domains concurrently.
>
> node 1:
> 2018-12-03T16:03:03.037872+01:00 ha-idg-1 crmd[8615]: warning: Action 44
> (vm_mausdb_migrate_to_0) on ha-idg-2 failed (target: 0 vs. rc: 1): Error
> 2018-12-03T16:03:03.038234+01:00 ha-idg-1 crmd[8615]: notice: Transition
> 44 aborted by operation vm_mausdb_migrate_to_0 'modify' on ha-idg-2: Event
> failed
> 2018-12-03T16:03:03.038569+01:00 ha-idg-1 crmd[8615]: warning: Action 44
> (vm_mausdb_migrate_to_0) on ha-idg-2 failed (target: 0 vs.
rc: 1): Error > > 2018-12-03T16:03:03.041771+01:00 ha-idg-1 stonith-ng[8611]: warning: > fence_ha-idg-2 has 'action' parameter, which should never be specified in > configuration > 2018-12-03T16:03:03.042112+01:00 ha-idg-1 stonith-ng[8611]: warning: > Mapping action='Off' to pcmk_off_action='Off' > 2018-12-03T16:03:03.042365+01:00 ha-idg-1 stonith-ng[8611]: warning: > Mapping action='Off' to pcmk_reboot_action='Off' > 2018-12-03T16:03:03.208895+01:00 ha-idg-1 sshd[6397]: Received disconnect > from 146.107.235.132 port 46004:11: disconnected by user > 2018-12-03T16:03:03.209367+01:00 ha-idg-1 sshd[6397]: Disconnected from > 146.107.235.132 port 46004 > 2018-12-03T16:03:03.209670+01:00 ha-idg-1 sshd[6397]: > pam_unix(sshd:session): session closed for user root > 2018-12-03T16:03:03.213996+01:00 ha-idg-1 systemd-logind[1627]: Removed > session 17. > 2018-12-03T16:03:03.286986+01:00 ha-idg-1 crmd[8615]: notice: Transition > 44 (Complete=6, Pending=0, Fired=0, Skipped=1, Incomplete=9, > Source=/var/lib/pacemaker/pengine/pe-input-346.bz2): Stopped > 2018-12-03T16:03:03.290762+01:00 ha-idg-1 stonith-ng[8611]: warning: > fence_ha-idg-2 has 'action' parameter, which should never be specified in > configuration > 2018-12-03T16:03:03.291224+01:00 ha-idg-1 stonith-ng[8611]: warning: > Mapping action='Off' to pcmk_off_action='Off' > 2018-12-03T16:03:03.291619+01:00 ha-idg-1 stonith-ng[8611]: warning: > Mapping action='Off' to pcmk_reboot_action='Off' > 2018-12-03T16:03:03.313864+01:00 ha-idg-1 pengine[8614]: warning: > Processing failed op migrate_to for vm_mausdb on ha-idg-2: unknown error (1) > 2018-12-03T16:03:03.314193+01:00 ha-idg-1 pengine[8614]: warning: > Processing failed op migrate_to for vm_mausdb on ha-idg-2: unknown error (1) > 2018-12-03T16:03:03.316485+01:00 ha-idg-1 pengine[8614]: notice: Recover > vm_mausdb#011(Started ha-idg-2 -> ha-idg-1) > 2018-12-03T16:03:03.316805+01:00 ha-idg-1 pengine[8614]: notice: Migrate > vm_geneious#011(Started ha-idg-2 -> ha-idg-1) 
> 2018-12-03T16:03:03.317914+01:00 ha-idg-1 pengine[8614]: notice: > Calculated transition 45, saving inputs in > /var/lib/pacemaker/pengine/pe-input-347.bz2 > > node2: > 2018-12-03T16:02:32.312598+01:00 ha-idg-2 VirtualDomain(vm_mausdb)[7988]: > INFO: mausdb: Starting live migration to ha-idg-1 (using: virsh > --connect=qemu:///system --quiet migrate --live ma > usdb qemu+ssh://ha-idg-1/system ). > 2018-12-03T16:02:32.315358+01:00 ha-idg-2 VirtualDomain(vm_geneious)[7989]: > INFO: geneious: Starting live migration to ha-idg-1 (using: virsh > --connect=qemu:///system --quiet migrate --live > geneious qemu+ssh://ha-idg-1/system ). > 2018-12-03T16:02:32.391911+01:00 ha-idg-2 lrmd[14256]: notice: executing - > rsc:vm_sim action:stop call_id:78 > 2018-12-03T16:02:32.394065+01:00 ha-idg-2 stonith-ng[14255]: warning: > fence_ha-idg-1 has 'action' parameter, which should never be specified in > configuration > 2018-12-03T16:02:32.394369+01:00 ha-idg-2 stonith-ng[14255]: warning: > Mapping action='Off' to pcmk_off_action='Off' > 2018-12-03T16:02:32.394619+01:00 ha-idg-2 stonith-ng[14255]: warning: > Mapping action='Off' to pcmk_reboot_action='Off' > 2018-12-03T16:02:32.459622+01:00 ha-idg-2 VirtualDomain(vm_sim)[8599]: INFO: > Domain sim already stopped. 
> 2018-12-03T16:02:32.464042+01:00 ha-idg-2 lrmd[14256]: notice: finished - > rsc:vm_sim action:stop call_id:78 pid:8599 exit-code:0 exec-time:72ms > queue-time:0ms > 2018-12-03T16:02:32.464750+01:00 ha-idg-2 crmd[14259]: notice: Result of > stop operation for vm_sim on ha-idg-2: 0 (ok) > 2018-12-03T16:02:32.470496+01:00 ha-idg-2 stonith-ng[14255]: warning: > fence_ha-idg-1 has 'action' parameter, which should never be specified in > configuration > 2018-12-03T16:02:32.470794+01:00 ha-idg-2 stonith-ng[14255]: warning: > Mapping action='Off' to pcmk_off_action='Off' > 2018-12-03T16:02:32.471043+01:00 ha-idg-2 stonith-ng[14255]: warning: > Mapping action='Off' to pcmk_reboot_action='Off' > 2018-12-03T16:02:32.628562+01:00 ha-idg-2 stonith-ng[14255]: warning: > fence_ha-idg-1 has 'action' parameter, which should never be specified in > configuration > 2018-12-03T16:02:32.628970+01:00 ha-idg-2 stonith-ng[14255]: warning: > Mapping action='Off' to pcmk_off_action='Off' > 2018-12-03T16:02:32.629226+01:00 ha-idg-2 stonith-ng[14255]: warning: > Mapping action='Off' to pcmk_reboot_action='Off' > 2018-12-03T16:03:02.836145+01:00 ha-idg-2 libvirtd[3117]: 2018-12-03 > 15:03:02.835+0000: 4515: error : qemuMigrationCheckJobStatus:1456 : operation > failed: migration job: unexpectedly failed > 2018-12-03T16:03:03.006918+01:00 ha-idg-2 VirtualDomain(vm_mausdb)[7988]: > ERROR: mausdb: live migration to ha-idg-1 failed: 1 > 2018-12-03T16:03:03.032370+01:00 ha-idg-2 kernel: [ 6593.265436] br0: port > 2(vnet0) entered disabled state > 2018-12-03T16:03:03.032383+01:00 ha-idg-2 kernel: [ 6593.266268] device > vnet0 left promiscuous mode > 2018-12-03T16:03:03.032385+01:00 ha-idg-2 kernel: [ 6593.266275] br0: port > 2(vnet0) entered disabled state > 2018-12-03T16:03:03.032789+01:00 ha-idg-2 lrmd[14256]: notice: > vm_mausdb_migrate_to_0:7988:stderr [ error: operation failed: migration job: > unexpectedly failed ] > 2018-12-03T16:03:03.033126+01:00 ha-idg-2 lrmd[14256]: notice: > 
vm_mausdb_migrate_to_0:7988:stderr [ ocf-exit-reason:mausdb: live migration > to ha-idg-1 failed: 1 ] > 2018-12-03T16:03:03.033376+01:00 ha-idg-2 lrmd[14256]: notice: finished - > rsc:vm_mausdb action:migrate_to call_id:75 pid:7988 exit-code:1 > exec-time:31065ms queue-time:0ms > 2018-12-03T16:03:03.033909+01:00 ha-idg-2 crmd[14259]: notice: Result of > migrate_to operation for vm_mausdb on ha-idg-2: 1 (unknown error) > 2018-12-03T16:03:03.034198+01:00 ha-idg-2 crmd[14259]: notice: > ha-idg-2-vm_mausdb_migrate_to_0:75 [ error: operation failed: migration job: > unexpectedly failed\nocf-exit-reason:mausdb: live migration to ha-idg-1 > failed: 1\n ] > > /var/log/pacemaker.log: > Dec 03 16:02:32 [8611] ha-idg-1 stonith-ng: info: cib_devices_update: > Updating devices to version 1.789.6 > Dec 03 16:02:32 [8611] ha-idg-1 stonith-ng: warning: xml2device_params: > fence_ha-idg-2 has 'action' parameter, which should never be specified in > configuration > Dec 03 16:02:32 [8611] ha-idg-1 stonith-ng: warning: map_action: > Mapping action='Off' to pcmk_off_action='Off' > Dec 03 16:02:32 [8611] ha-idg-1 stonith-ng: warning: map_action: > Mapping action='Off' to pcmk_reboot_action='Off' > Dec 03 16:02:32 [8611] ha-idg-1 stonith-ng: info: cib_device_update: > Device fence_ha-idg-1 has been disabled on ha-idg-1: score=-INFINITY > Dec 03 16:02:37 [8610] ha-idg-1 cib: info: cib_process_ping: > Reporting our current digest to ha-idg-1: 2715b3a8fd056fbd4d84cb12b911a328 > for 1.789.6 (0x1aebf00 0) > Dec 03 16:03:03 [8610] ha-idg-1 cib: info: cib_perform_op: > Diff: --- 1.789.6 2 > Dec 03 16:03:03 [8610] ha-idg-1 cib: info: cib_perform_op: > Diff: +++ 1.789.7 (null) > Dec 03 16:03:03 [8610] ha-idg-1 cib: info: cib_perform_op: + > /cib: @num_updates=7 > Dec 03 16:03:03 [8610] ha-idg-1 cib: info: cib_perform_op: + > /cib/status/node_state[@id='1084777492']/lrm[@id='1084777492']/lrm_resources/ > lrm_resource[@id='vm_mausdb']/lrm_rsc > _op[@id='vm_mausdb_last_0']: > 
@transition-magic=0:1;44:44:0:e4b76be7-fdb6-47c1-a905-9a49650b4180, > @exit-reason=mausdb: live migration to ha-idg-1 failed: 1, @call-id=75, > @rc-code=1, @op-sta > tus=0, @exec-time=31065 > Dec 03 16:03:03 [8610] ha-idg-1 cib: info: cib_perform_op: ++ > /cib/status/node_state[@id='1084777492']/lrm[@id='1084777492']/lrm_resources/ > lrm_resource[@id='vm_mausdb']: <lrm_ > rsc_op id="vm_mausdb_last_failure_0" operation_key="vm_mausdb_migrate_to_0" > operation="migrate_to" crm-debug-origin="do_update_resource" > crm_feature_set="3.0.13" transition-key="44:44:0:e4b > 76be7-fdb6-47c1-a905-9a49650b4180" > transition-magic="0:1;44:44:0:e4b76be7-fdb6-47c1-a905-9a49650b4180" > exit-reason="mausdb: live migrat > Dec 03 16:03:03 [8610] ha-idg-1 cib: info: cib_process_request: > Completed cib_modify operation for section status: OK (rc=0, > origin=ha-idg-2/crmd/74, version=1.789.7) > Dec 03 16:03:03 [8615] ha-idg-1 crmd: warning: status_from_rc: > Action 44 (vm_mausdb_migrate_to_0) on ha-idg-2 failed (target: 0 vs. rc: 1): > Error > Dec 03 16:03:03 [8615] ha-idg-1 crmd: notice: > abort_transition_graph: Transition 44 aborted by operation > vm_mausdb_migrate_to_0 'modify' on ha-idg-2: Event failed | magic=0:1;44:4 > 4:0:e4b76be7-fdb6-47c1-a905-9a49650b4180 cib=1.789.7 > source=match_graph_event:310 complete=false > Dec 03 16:03:03 [8615] ha-idg-1 crmd: info: match_graph_event: > Action vm_mausdb_migrate_to_0 (44) confirmed on ha-idg-2 (rc=1) > Dec 03 16:03:03 [8615] ha-idg-1 crmd: info: process_graph_event: > Detected action (44.44) vm_mausdb_migrate_to_0.75=unknown error: failed > Dec 03 16:03:03 [8615] ha-idg-1 crmd: warning: status_from_rc: > Action 44 (vm_mausdb_migrate_to_0) on ha-idg-2 failed (target: 0 vs. 
rc: 1):
> Error
> Dec 03 16:03:03 [8615] ha-idg-1 crmd: info:
> abort_transition_graph: Transition 44 aborted by operation
> vm_mausdb_migrate_to_0 'create' on ha-idg-2: Event failed | magic=0:1;44:4
> 4:0:e4b76be7-fdb6-47c1-a905-9a49650b4180 cib=1.789.7
> source=match_graph_event:310 complete=false
> Dec 03 16:03:03 [8615] ha-idg-1 crmd: info: match_graph_event:
> Action vm_mausdb_migrate_to_0 (44) confirmed on ha-idg-2 (rc=1)
> Dec 03 16:03:03 [8615] ha-idg-1 crmd: info: process_graph_event:
> Detected action (44.44) vm_mausdb_migrate_to_0.75=unknown error: failed
> Dec 03 16:03:03 [8611] ha-idg-1 stonith-ng: info:
> update_cib_stonith_devices_v2: Updating device list from the cib: modify
> lrm_rsc_op[@id='vm_mausdb_last_0']
>
> Hosts are SLES 12 SP3; pacemaker is the most recent version for my OS:
> pacemaker-1.1.16-6.5.1.x86_64
>
> I don't have a clue and am thankful for any hint. I can provide further
> information if needed.
>
>
> Bernd
> --
>
> Bernd Lentes
> Systemadministration
> Institut für Entwicklungsgenetik
> Gebäude 35.34 - Raum 208
> HelmholtzZentrum münchen
> [email protected]
> phone: +49 89 3187 1241
> fax: +49 89 3187 2294
> http://www.helmholtz-muenchen.de/idg
>
> those who make mistakes can learn something
> those who do nothing can learn nothing either
>
>
> Helmholtz Zentrum Muenchen
> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
> Ingolstaedter Landstr. 1
> 85764 Neuherberg
> www.helmholtz-muenchen.de
> Aufsichtsratsvorsitzende: MinDirig.in Petra Steiner-Hoffmann
> Stellv.Aufsichtsratsvorsitzender: MinDirig. Dr. Manfred Wolter
> Geschaeftsfuehrer: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich
> Bassler, Dr. rer. nat.
Alfons Enhsen > Registergericht: Amtsgericht Muenchen HRB 6466 > USt-IdNr: DE 129521671 > > _______________________________________________ > Users mailing list: [email protected] > https://lists.clusterlabs.org/mailman/listinfo/users > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org
