On 11/04/2015 05:19 PM, Dr. David Alan Gilbert wrote:
> * Wen Congyang ([email protected]) wrote:
>> On 11/04/2015 05:05 PM, Dr. David Alan Gilbert wrote:
>>> * Wen Congyang ([email protected]) wrote:
>>>> On 11/03/2015 09:47 PM, Dr. David Alan Gilbert wrote:
>>>>> * Juan Quintela ([email protected]) wrote:
>>>>>> "Dr. David Alan Gilbert" <[email protected]> wrote:
>>>>>>> Hi,
>>>>>>> I'm trying to understand why migration_bitmap_extend is correct/safe;
>>>>>>> If I understand correctly, you're arguing that:
>>>>>>>
>>>>>>> 1) the migration_bitmap_mutex around the extend, stops any sync's
>>>>>>> happening
>>>>>>> and so no new bits will be set during the extend.
>>>>>>>
>>>>>>> 2) If migration sends a page and clears a bitmap entry, it doesn't
>>>>>>> matter if we lose the 'clear' because we're copying it as
>>>>>>> we extend it, because losing the clear just means the page
>>>>>>> gets resent, and so the data is OK.
>>>>>>>
>>>>>>> However, doesn't (2) mean that migration_dirty_pages might be wrong?
>>>>>>> If a page was sent, the bit cleared, and migration_dirty_pages
>>>>>>> decremented,
>>>>>>> then if we copy over that bitmap and 'set' that bit again then
>>>>>>> migration_dirty_pages
>>>>>>> is too small; that means that either migration would finish too early,
>>>>>>> or more likely, migration_dirty_pages would wrap-around -ve and
>>>>>>> never finish.
>>>>>>>
>>>>>>> Is there a reason it's really safe?
>>>>>>
>>>>>> No. It is reasonably safe. Various values of reasonably.
>>>>>>
>>>>>> migration_dirty_pages should never arrive at values near zero. Because
>>>>>> we move to the completion stage way before it gets a value near zero.
>>>>>> (We could have very, very bad luck, as in it is not safe).
>>>>>
>>>>> That's only true if we hit the qemu_file_rate_limit() in ram_save_iterate;
>>>>> if we don't hit the rate limit (e.g.
>>>>> because we're CPU or network limited
>>>>> to slower than the set limit) then I think ram_save_iterate will go all
>>>>> the
>>>>> way to sending every page; if that happens it'll go once more
>>>>> around the main migration loop, and call the pending routine, and now get
>>>>> a -ve (very +ve) number of pending pages, so continuously do
>>>>> ram_save_iterate
>>>>> again.
>>>>>
>>>>> We've had that type of bug before when we messed up the dirty-pages
>>>>> calculation
>>>>> during hotplug.
>>>>
>>>> IIUC, migration_bitmap_extend() is called when migration is running, and
>>>> we hotplug
>>>> a device.
>>>>
>>>> In this case, I think we hold the iothread mutex when
>>>> migration_bitmap_extend() is called.
>>>>
>>>> ram_save_complete() is also protected by the iothread mutex.
>>>>
>>>> So if migration_bitmap_extend() is called, the migration thread may be
>>>> blocked in
>>>> migration_completion() and wait for it. qemu_savevm_state_complete() will be
>>>> called after
>>>> migration_completion() returns.
>>>
>>> But I don't think ram_save_iterate is protected by that lock, and my concern
>>> is that the dirty-pages calculation is wrong during the iteration phase,
>>> and then
>>> the iteration phase will never exit and never try and get to
>>> ram_save_complete.
>>
>> Yes, the dirty-pages may be wrong. But it is smaller, not larger than the
>> exact value.
>> Why will the iteration phase never exit?
>
> Imagine that migration_dirty_pages is slightly too small and we enter
> ram_save_iterate;
> ram_save_iterate now sends *all* its pages, it decrements
> migration_dirty_pages for
> every page sent. At the end of ram_save_iterate, migration_dirty_pages would
> be negative.
> But migration_dirty_pages is *u*int64_t; so we exit ram_save_iterate,
> go around the main migration_thread loop again and call
> qemu_savevm_state_pending, and
> it returns a very large number (because it's actually a negative number), so
> we keep
> going around the loop, because it never gets smaller.

I don't know how to trigger the problem. I think storing migration_dirty_pages
in BitmapRcu can fix this problem.

Thanks
Wen Congyang

>
> Dave
>
>>
>> Thanks
>> Wen Congyang
>>
>>>
>>> Dave
>>>
>>>>
>>>> Thanks
>>>> Wen Congyang
>>>>
>>>>>
>>>>>> Now, do we really care if migration_dirty_pages is exact? Not really,
>>>>>> we just use it to calculate if we should start the throttle or not.
>>>>>> That only test that each 1 second, so if we have written a couple of
>>>>>> pages that we are not accounting for, things should be reasonably safe.
>>>>>>
>>>>>> Once told that, I don't know why we didn't catch that problem during
>>>>>> review (yes, I am guilty here). Not sure how to really fix it,
>>>>>> though. I think that the problem is more theoretical than real, but
>>>>>
>>>>> Dave
>>>>>
>>>>>> ....
>>>>>>
>>>>>> Thanks, Juan.
>>>>>>
>>>>>>>
>>>>>>> Dave
>>>>>>>
>>>>>>> --
>>>>>>> Dr. David Alan Gilbert / [email protected] / Manchester, UK
>>>>> --
>>>>> Dr. David Alan Gilbert / [email protected] / Manchester, UK
>>>>>
>>>>> .
>>>>>
>>>>
>>> --
>>> Dr. David Alan Gilbert / [email protected] / Manchester, UK
>>> .
>>>
>>
> --
> Dr. David Alan Gilbert / [email protected] / Manchester, UK
> .
>
