On Thu, May 09, 2019 at 10:33:19AM +0800, Peter Xu wrote:
> On Wed, May 08, 2019 at 01:55:07PM +0200, Paolo Bonzini wrote:
> > On 08/05/19 06:39, Peter Xu wrote:
> > >> The disadvantage of this is that you won't clear in the kernel those
> > >> dirty bits that come from other sources (e.g. vhost or
> > >> address_space_map).  This can lead to double-copying of pages.
> > >>
> > >> Migration already makes a local copy in rb->bmap, and
> > >> memory_region_snapshot_and_clear_dirty can also do the clear.  Would it
> > >> be possible to invoke the clear using rb->bmap instead of the KVMSlot's
> > >> new bitmap?
> > > Actually that's what I did in the first version before I posted the
> > > work, but I noticed that there seems to be a race condition in the
> > > design.  The problem is that we have multiple copies of the same dirty
> > > bitmap from KVM, and the race can happen between those multiple users
> > > (the users' bitmaps can be merged versions containing KVM and other
> > > sources like vhost, address_space_map, etc., but let's keep it simple
> > > and leave those out for now).
> > I see now.  And in fact the same double-copying inefficiency happens
> > already without this series, so you are improving the situation anyway.
> > Have you done any kind of benchmarking already?
> Not yet.  I posted the series for some initial reviews first before
> moving on with performance tests.
>
> My plan for the test scenario is:
>
> - find a guest with relatively large memory (I would guess memory of
>   64G or more is needed to show a big difference),
>
> - run a random dirty-memory workload over most of the memory, with
>   dirty rate X Bps,
>
> - set the migration bandwidth to Y Bps (Y should be bigger than X, but
>   not by much; e.g. X=800M and Y=1G to emulate a 10G NIC with a
>   workload that can still converge with precopy only) and start the
>   precopy migration,
>
> - measure total migration time with CLEAR_LOG on & off.
> We should expect two things from the guest with CLEAR_LOG: (1) it does
> not hang during log_sync, and (2) the migration completes faster.
Some updates on performance numbers.

Summary: the ideal case below shows a ~40% (or even bigger) reduction
in total migration time for the same VM with the same workload.  In
other words, it could be seen as ~40% faster than before.

Test environment: 13G guest, 10G test memory (so 3G is left
untouched), dirty rate 900MB/s, bandwidth 10Gbps (to emulate an ixgbe
NIC), downtime 100ms.

IO pattern: I pre-fault all of the 10G test memory, then do random
writes over it at a constant dirty rate (900MB/s, as mentioned), using
the command "mig_mon mm_dirty 10240 900 random" [1].  The migration
runs while the IOs are in flight.

Here are the total migration times for such a VM (for each scenario I
ran the migration 5 times and took the average):

|--------------+---------------------+-------------|
| scenario     | migration times (s) | average (s) |
|--------------+---------------------+-------------|
| no CLEAR_LOG | 55, 54, 56, 74, 54  |          58 |
| 1G chunk     | 40, 39, 41, 39, 40  |          40 |
| 128M chunk   | 38, 40, 37, 40, 38  |          38 |
| 16M chunk    | 42, 40, 38, 41, 38  |          39 |
| 1M chunk     | 37, 40, 36, 40, 39  |          38 |
|--------------+---------------------+-------------|

The first scenario, "no CLEAR_LOG", is the master branch, which still
uses GET_DIRTY_LOG only.  The other four scenarios all use the new
CLEAR_LOG interface, aka, this series.

The results show that a 128M chunk size seems to be a better default
value than 1G (which this series currently uses).  I'll adjust that
accordingly when I post the next version.

[1] https://github.com/xzpeter/clibs/blob/master/bsd/mig_mon/mig_mon.c

Regards,

-- 
Peter Xu