On Wed, Jun 27, 2018 at 09:22:42PM +0800, Peter Xu wrote:
> v3:
> - keep the recovery logic even for RDMA by dropping the 3rd patch and
>   touch up the original 4th patch (current 3rd patch) to suite that [Dave]
>
> v2:
> - break the first patch into several
> - fix a QEMUFile leak
>
> Please review.  Thanks,

Hi Peter,

I applied this patchset on top of upstream QEMU to test the postcopy
pause/recover feature on PowerPC. The source and target hosts share a
qcow2 image over NFS.

source:
# ppc64-softmmu/qemu-system-ppc64 --enable-kvm --nographic -vga none \
-machine pseries -m 64G,slots=128,maxmem=128G -smp 16,maxcpus=32 \
-device virtio-blk-pci,drive=rootdisk -drive \
file=/home/bala/sharing/hostos-ppc64le.qcow2,if=none,cache=none,format=qcow2,id=rootdisk \
-monitor telnet:127.0.0.1:1234,server,nowait -net nic,model=virtio \
-net user -redir tcp:2000::22

To keep the VM under load, I ran stress-ng inside the guest:
# stress-ng --cpu 6 --vm 6 --io 6

target:
# ppc64-softmmu/qemu-system-ppc64 --enable-kvm --nographic -vga none \
-machine pseries -m 64G,slots=128,maxmem=128G -smp 16,maxcpus=32 \
-device virtio-blk-pci,drive=rootdisk -drive \
file=/home/bala/sharing/hostos-ppc64le.qcow2,if=none,cache=none,format=qcow2,id=rootdisk \
-monitor telnet:127.0.0.1:1235,server,nowait -net nic,model=virtio \
-net user -redir tcp:2001::22 -incoming tcp:0:4445

I enabled postcopy on both source and destination from the QEMU monitor:
(qemu) migrate_set_capability postcopy-ram on

From the source QEMU monitor:
(qemu) migrate -d tcp:10.45.70.203:4445
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on x-colo: off release-ram: off block: off return-path: off pause-before-switchover: off x-multifd: off dirty-bitmaps: off postcopy-blocktime: off late-block-activate: off
Migration status: active
total time: 2331 milliseconds
expected downtime: 300 milliseconds
setup: 65 milliseconds
transferred ram: 38914 kbytes
throughput: 273.16 mbps
remaining ram: 67063784 kbytes
total ram: 67109120 kbytes
duplicate: 1627 pages
skipped: 0 pages
normal: 9706 pages
normal bytes: 38824 kbytes
dirty sync count: 1
page size: 4 kbytes
multifd bytes: 0 kbytes

I then triggered postcopy from the source:
(qemu) migrate_start_postcopy

After triggering postcopy from the source, I tried to pause the postcopy
migration on the target:
(qemu) migrate_pause

On the target I see this error:
error while loading state section id 4(ram)
qemu-system-ppc64: Detected IO failure for postcopy. Migration paused.

On the source I see this error:
qemu-system-ppc64: Detected IO failure for postcopy. Migration paused.

Later I tried to recover from the target monitor:
(qemu) migrate_recover qemu+ssh://10.45.70.203/system
Migrate recovery is triggered already

but the source still remains in the postcopy-paused state:
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: on x-colo: off release-ram: off block: off return-path: off pause-before-switchover: off x-multifd: off dirty-bitmaps: off postcopy-blocktime: off late-block-activate: off
Migration status: postcopy-paused
total time: 222841 milliseconds
expected downtime: 382991 milliseconds
setup: 65 milliseconds
transferred ram: 385270 kbytes
throughput: 265.06 mbps
remaining ram: 8150528 kbytes
total ram: 67109120 kbytes
duplicate: 14679647 pages
skipped: 0 pages
normal: 63937 pages
normal bytes: 255748 kbytes
dirty sync count: 2
page size: 4 kbytes
multifd bytes: 0 kbytes
dirty pages rate: 854740 pages
postcopy request count: 374

Later I also tried to recover postcopy from the source monitor:
(qemu) migrate_recover qemu+ssh://10.45.193.21/system
Migrate recover can only be run when postcopy is paused.

It looks broken to me; please let me know if I missed something in this
test.

Thank you,
Bala

>
> Peter Xu (4):
>   migration: delay postcopy paused state
>   migration: move income process out of multifd
>   migration: unbreak postcopy recovery
>   migration: unify incoming processing
>
>  migration/ram.h       |  2 +-
>  migration/exec.c      |  3 ---
>  migration/fd.c        |  3 ---
>  migration/migration.c | 44 ++++++++++++++++++++++++++++++++++++-------
>  migration/ram.c       | 11 +++++------
>  migration/savevm.c    |  6 +++---
>  migration/socket.c    |  5 -----
>  7 files changed, 46 insertions(+), 28 deletions(-)
>
> --
> 2.17.1
>
>
