Package: libvirt0
Version: 0.9.12-11+deb7u1
Severity: important

Hello,

when doing a live migration using Pacemaker (the OCF VirtualDomain RA) on a cluster with DRBD (active/active) backing storage, everything works fine with recently started KVM guests (small memory footprint of about 200MB at most).

After inflating one guest to 2GB memory usage (memtester comes in handy for that), the migration failed after 30 seconds, having managed to migrate about 400MB in that time over the direct, dedicated GbE link between my test cluster host nodes.

libvirtd.log on the migration target node; migration start time is 07:24:51:
---
2013-08-13 07:24:51.807+0000: 31953: warning : qemuDomainObjEnterMonitorInternal:994 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
2013-08-13 07:24:51.886+0000: 31953: warning : qemuDomainObjEnterMonitorInternal:994 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
2013-08-13 07:24:51.888+0000: 31953: warning : qemuDomainObjEnterMonitorInternal:994 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
2013-08-13 07:24:51.948+0000: 31953: warning : qemuDomainObjEnterMonitorInternal:994 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
2013-08-13 07:24:51.948+0000: 31953: warning : qemuDomainObjEnterMonitorInternal:994 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
2013-08-13 07:25:21.217+0000: 31950: warning : virKeepAliveTimer:182 : No response from client 0x1948280 after 5 keepalive messages in 30 seconds
2013-08-13 07:25:31.224+0000: 31950: warning : qemuProcessKill:3813 : Timed out waiting after SIGTERM to process 15926, sending SIGKILL
---

Below is the only thing I could find that is somewhat related to this. Unfortunately, it was cured by the miracle that is the next version upgrade, without the root cause being found:
https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=816451

I will install Sid on another test cluster tomorrow and am betting that it will work just fine there. Since Testing is still at the same level as Wheezy, I'm also betting that we won't see anything in wheezy-backports anytime soon. I'd really rather not create a production cluster based on Jessie or do those rather complex backports myself...

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Fusion Communications
http://www.gol.com/
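
As a rough sanity check on the figures in this report, the sketch below works out three numbers: the time between the first monitor warning and the keepalive warning in the log excerpt, the effective transfer rate implied by "about 400MB in 30 seconds", and how long a single pass over 2GB would take at that rate. The constants are taken from the report, except the keepalive settings: interval 5 and count 5 are assumed to be the libvirtd.conf values in effect (the log only says "5 keepalive messages in 30 seconds"). 1GB is treated as 1000MB, and re-sending of dirtied pages is ignored, so this is only an estimate, not a measurement.

```python
from datetime import datetime

# Timestamps copied from the libvirtd.log excerpt in this report.
FIRST_MONITOR_WARNING = "2013-08-13 07:24:51.807+0000"
KEEPALIVE_WARNING = "2013-08-13 07:25:21.217+0000"

# Figures quoted in the report (approximate).
MIGRATED_MB = 400        # "about 400MB" transferred before the failure
FAILED_AFTER_S = 30      # "failed after 30 seconds"
GUEST_MEMORY_MB = 2000   # 2GB guest, assuming 1GB = 1000MB

# Assumption: keepalive settings as suggested by the log message
# "5 keepalive messages in 30 seconds"; not confirmed from libvirtd.conf.
KEEPALIVE_INTERVAL_S = 5
KEEPALIVE_COUNT = 5


def parse_ts(stamp: str) -> datetime:
    """Parse a libvirtd log timestamp such as '2013-08-13 07:24:51.807+0000'."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f%z")


# Seconds between the first logged warning and the keepalive warning.
elapsed_s = (parse_ts(KEEPALIVE_WARNING) - parse_ts(FIRST_MONITOR_WARNING)).total_seconds()

# Window after which an unresponsive peer is given up on,
# assuming interval * (count + 1).
keepalive_window_s = KEEPALIVE_INTERVAL_S * (KEEPALIVE_COUNT + 1)

# Effective transfer rate implied by the report, in MB/s and Mbit/s.
rate_mb_s = MIGRATED_MB / FAILED_AFTER_S
rate_mbit_s = rate_mb_s * 8

# Time for one full pass over guest memory at that rate.
full_copy_s = GUEST_MEMORY_MB / rate_mb_s

print(f"first warning to keepalive warning: {elapsed_s:.2f} s")
print(f"assumed keepalive window:           {keepalive_window_s} s")
print(f"effective rate:                     {rate_mb_s:.1f} MB/s ({rate_mbit_s:.0f} Mbit/s)")
print(f"one pass over guest memory:         {full_copy_s:.0f} s")
```

Under these assumptions the implied rate is well below what a GbE link can carry, and a single pass over the guest's memory would take several times longer than the 30-second keepalive window, which is consistent with small guests migrating fine and the 2GB guest failing.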