> From: Qemu-devel <qemu-devel-
> [email protected]> On Behalf Of Jason Gunthorpe
> Sent: Tuesday, 18 October 2022 15:23
> To: Joao Martins <[email protected]>
> Cc: [email protected]; Alex Williamson <[email protected]>;
> Eric Blake <[email protected]>; Stefan Hajnoczi <[email protected]>;
> Fam Zheng <[email protected]>; [email protected]; Cornelia Huck
> <[email protected]>; Thomas Huth <[email protected]>; Vladimir
> Sementsov-Ogievskiy <[email protected]>; Laurent Vivier
> <[email protected]>; John Snow <[email protected]>; Dr. David Alan
> Gilbert <[email protected]>; Christian Borntraeger
> <[email protected]>; Halil Pasic <[email protected]>; Paolo
> Bonzini <[email protected]>; [email protected]; Eric Farman
> <[email protected]>; Richard Henderson
> <[email protected]>; David Hildenbrand <[email protected]>;
> Avihai Horon <[email protected]>; [email protected]
> Subject: Re: [RFC 7/7] migration: call qemu_savevm_state_pending_exact()
> with the guest stopped
>
> On Fri, Oct 14, 2022 at 01:29:51PM +0100, Joao Martins wrote:
> > On 14/10/2022 12:28, Juan Quintela wrote:
> > > Joao Martins <[email protected]> wrote:
> > >> On 13/10/2022 17:08, Juan Quintela wrote:
> > >>> Oops. My understanding was that once the guest is stopped you can
> > >>> say how big is it.
> > >
> > > Hi
> > >
> > >> It's worth keeping in mind that conceptually a VF won't stop (e.g.
> > >> DMA) until really told through VFIO. So, stopping CPUs (guest) just
> > >> means that guest code does not arm the VF for more I/O but still is
> > >> a weak guarantee as VF still has outstanding I/O to deal with until
> > >> VFIO tells it to quiesce DMA (for devices that support it).
> > >
> > > How can we make sure about that?
> > >
> > > i.e. I know I have a vfio device. I need two things:
> > > - in the iterative stage, I need to check the size, but a estimate is ok.
> > >   for example with RAM, we use whatever is the size of the dirty bitmap
> > >   at that moment.
> > >   If the estimated size is smaller than the threshold, we
> > >   * stop the guest
> > >   * sync dirty bitmap
> > >   * now we test the (real) dirty bitmap size
> > >
> > > How can we do something like that with a vfio device.
> > >
> > You would have an extra intermediate step that stops the VF prior to
> > asking the device state length. What I am not sure is whether stopping
> > vCPUs can be skipped as an optimization.
>
> It cannot, if you want to stop the VFIO device you must also stop the vCPUs
> because the device is not required to respond properly to MMIO operations
> when stopped.
>
> > > My understanding from NVidia folks was that newer firmware have an
> > > ioctl to return that information.
> >
> > Maybe there's something new. I was thinking from this here:
>
> Juan is talking about the ioctl we had in the pre-copy series.
>
> I expect it to come into some different form to support this RFC.
>

Do we really need to STOP the VM to get the exact data length that will be
required to complete stop copy?

Can't we simply go with a close estimate taken while the device is running,
and drop all the complexity in QEMU/kernel of having to STOP and then
RE-START the VM if the threshold wasn't met, etc.?

Yishai
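
To make the trade-off concrete, here is a rough sketch of the two flows
discussed in this thread. It is a toy model only: struct mock_dev and every
function name below are hypothetical and are not QEMU or VFIO APIs. Flow A
follows the "estimate, stop, then ask for the exact size" sequence (stopping
the vCPUs before the device, as Jason points out is required); flow B relies
on the running estimate alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a migratable device; not a QEMU or VFIO type. */
struct mock_dev {
    uint64_t estimate;    /* cheap size estimate, readable while running */
    uint64_t exact;       /* exact size, only trusted once everything is stopped */
    bool vcpus_stopped;
    bool dev_stopped;
    unsigned restarts;    /* number of times the guest had to be resumed */
};

/* Stop the vCPUs first, then the device: a stopped device is not
 * required to respond properly to MMIO from still-running vCPUs. */
static void stop_guest_and_device(struct mock_dev *d)
{
    d->vcpus_stopped = true;
    d->dev_stopped = true;
}

/* Undo the stop in reverse order and count the restart. */
static void resume_guest_and_device(struct mock_dev *d)
{
    d->dev_stopped = false;
    d->vcpus_stopped = false;
    d->restarts++;
}

/* Flow A: estimate while running, then stop and check the exact size.
 * Returns true if stop-copy can proceed (guest is left stopped). */
static bool try_complete_exact(struct mock_dev *d, uint64_t threshold)
{
    if (d->estimate > threshold) {
        return false;                 /* keep iterating, guest untouched */
    }
    stop_guest_and_device(d);
    assert(d->vcpus_stopped && d->dev_stopped);
    if (d->exact <= threshold) {
        return true;
    }
    resume_guest_and_device(d);       /* estimate was too optimistic */
    return false;
}

/* Flow B: trust the running estimate and never restart. The risk is that
 * the real stop-copy size turns out to be larger than the threshold. */
static bool try_complete_estimate(struct mock_dev *d, uint64_t threshold)
{
    if (d->estimate > threshold) {
        return false;
    }
    stop_guest_and_device(d);
    return true;
}
```

With an estimate of 90, an exact size of 150 and a threshold of 100, flow A
stops the guest, finds it is over budget and restarts it, while flow B goes
straight to stop copy and absorbs the extra data as downtime.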
