On Wed, Aug 14, 2013 at 04:49:42PM +0900, Christian Balzer wrote:
>
> Package: libvirt0
> Version: 0.9.12-11+deb7u1
> Severity: important
>
> Hello,
>
> when doing a live migration using Pacemaker (the OCF VirtualDomain RA) on
> a cluster with DRBD (active/active) backing storage everything works fine
> with recently started (small memory footprint of about 200MB at most) KVM
> guests.
>
> After inflating one guest to 2GB memory usage (memtester comes in handy
> for that) the migration failed after 30 seconds, having managed to migrate
> about 400MB in that time over the direct, dedicated GbE link between my
> test cluster host nodes.
>
> libvirtd.log on the migration target node, migration start time is
> 07:24:51 :
> ---
> 2013-08-13 07:24:51.807+0000: 31953: warning : qemuDomainObjEnterMonitorInternal:994 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
> 2013-08-13 07:24:51.886+0000: 31953: warning : qemuDomainObjEnterMonitorInternal:994 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
> 2013-08-13 07:24:51.888+0000: 31953: warning : qemuDomainObjEnterMonitorInternal:994 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
> 2013-08-13 07:24:51.948+0000: 31953: warning : qemuDomainObjEnterMonitorInternal:994 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
> 2013-08-13 07:24:51.948+0000: 31953: warning : qemuDomainObjEnterMonitorInternal:994 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
> 2013-08-13 07:25:21.217+0000: 31950: warning : virKeepAliveTimer:182 : No response from client 0x1948280 after 5 keepalive messages in 30 seconds
> 2013-08-13 07:25:31.224+0000: 31950: warning : qemuProcessKill:3813 : Timed out waiting after SIGTERM to process 15926, sending SIGKILL

This looks more like you're not replying via the keepalive protocol. What
are you using to migrate VMs?
 -- Guido

> ---
>
> Below is the only thing I could find which is somewhat related to this,
> unfortunately it was cured by the miracle that is the next version upgrade
> without the root cause being found:
> https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=816451
>
> I will install Sid on another test cluster tomorrow and am betting that it
> will work just fine there.
> Since Testing is still at the same level as Wheezy I'm also betting that
> we won't see anything in wheezy-backports anytime soon.
> I'd really rather not create a production cluster based on Jessie or do
> those rather complex backports myself...
>
>
> Regards,
>
> Christian
> --
> Christian Balzer        Network/Systems Engineer
> ch...@gol.com           Global OnLine Japan/Fusion Communications
> http://www.gol.com/
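
As a rough sanity check on the timing in the quoted libvirtd.log, here is a minimal Python sketch that works out the gaps between the warnings and the approximate transfer rate from the figures in the report (about 400MB in 30 seconds). The variable and function names are made up for illustration; only the timestamps and sizes come from the report. The 30-second window would also match keepalive_interval = 5 and keepalive_count = 5 in libvirtd.conf, if I recall those defaults correctly, so treat that part as an assumption.

```python
from datetime import datetime

# Timestamps copied from the quoted libvirtd.log (the +0000 offset is dropped,
# since only the differences matter here).
FMT = "%Y-%m-%d %H:%M:%S.%f"
first_monitor_warning = datetime.strptime("2013-08-13 07:24:51.807", FMT)
keepalive_warning = datetime.strptime("2013-08-13 07:25:21.217", FMT)
sigkill_warning = datetime.strptime("2013-08-13 07:25:31.224", FMT)


def seconds_between(start, end):
    """Return the elapsed time between two datetimes in seconds."""
    return (end - start).total_seconds()


# Time from the first monitor warning to the keepalive timeout.
gap_to_keepalive = seconds_between(first_monitor_warning, keepalive_warning)
# Time from the keepalive timeout to the SIGKILL escalation.
gap_to_sigkill = seconds_between(keepalive_warning, sigkill_warning)

# Approximate figures from the report: about 400MB migrated in 30 seconds.
migrated_mb = 400
elapsed_s = 30
rate_mb_per_s = migrated_mb / elapsed_s
rate_mbit_per_s = rate_mb_per_s * 8

print(f"first warning -> keepalive timeout: {gap_to_keepalive:.3f} s")
print(f"keepalive timeout -> SIGKILL:       {gap_to_sigkill:.3f} s")
print(f"approx. migration rate: {rate_mb_per_s:.1f} MB/s ({rate_mbit_per_s:.0f} Mbit/s)")
```

So the keepalive timeout fires roughly 30 seconds after the first monitor warning, which lines up with the reported failure time, and the transfer rate works out to roughly a tenth of the nominal 1000 Mbit/s of the GbE link.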