On 05/09/2024 21:31, Peter Xu wrote:
On Thu, Sep 05, 2024 at 07:45:43PM +0300, Avihai Horon wrote:
Does it also mean then that the currently reported stop-size - precopy-size
will be very close to the constant non-iterable data size?
It's not constant; it can change while the VM is running.
I wonder how heavy the VFIO_DEVICE_FEATURE_MIG_DATA_SIZE ioctl is.
I just gave it a quick shot with a busy VM migrating and estimate() is
invoked only every ~100ms.
VFIO might be different, but I wonder whether we can fetch stop-size in
estimate() somehow, so that estimate() stays pretty fast while we avoid
the rest of the exact() calls (which are destined to be useless without
VFIO).
IIUC the estimate()/exact() split exists so far because the RAM sync is
heavy in exact(). When idle, exact() now takes 80+ms for a 32G VM with
current master (which has a bug that I'm fixing up [1]..), and even after
the fix it still takes 3ms (I think both numbers include the dirty bitmap
sync for both vfio and kvm). So in that case maybe we can still try
fetching stop-size in both estimate() and exact(), but only sync the
bitmap in exact().
IIUC, the end goal is to prevent migration thread spinning uselessly in
pre-copy in such scenarios, right?
If we do eventually fetch stop-copy-size in estimate(), we will move
the spinning from "exact() -> estimate() -> exact() -> estimate() ..."
to "estimate() -> estimate() -> ...".
If so, what benefit would we get from this? We would only move the
useless work to another place.
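
To make the two spin patterns concrete, here is a rough toy model in
Python. It is only a sketch and not QEMU code: ToyDevice, run_precopy and
the stop_size_in_estimate flag are hypothetical names, and the loop only
loosely mimics the idea of "call estimate(), and fall through to exact()
when the estimate drops below the threshold".

```python
from dataclasses import dataclass


@dataclass
class ToyDevice:
    """Hypothetical device state, not a QEMU structure."""
    precopy_pending: int  # bytes that pre-copy iterations can still drain
    stop_copy_size: int   # bytes that can only be sent once the VM is stopped


def run_precopy(dev, threshold, stop_size_in_estimate, max_iters=5):
    """Return the sequence of pending-size queries a toy pre-copy loop makes."""
    calls = []
    for _ in range(max_iters):
        calls.append("estimate")
        pending = dev.precopy_pending
        if stop_size_in_estimate:
            # Proposal under discussion: estimate() also reports stop-size.
            pending += dev.stop_copy_size
        if pending < threshold:
            # The estimate looks small enough, so ask for the exact value.
            calls.append("exact")
            pending = dev.precopy_pending + dev.stop_copy_size
            if pending < threshold:
                calls.append("complete")
                break
        # One pre-copy iteration drains a fixed amount of iterable data.
        dev.precopy_pending = max(0, dev.precopy_pending - 100)
    return calls


# Pre-copy data is already drained, but stop-size alone exceeds the threshold.
print(run_precopy(ToyDevice(0, 500), 300, stop_size_in_estimate=False))
print(run_precopy(ToyDevice(0, 500), 300, stop_size_in_estimate=True))
```

In this model the first call alternates between estimate and exact, the
second one only repeats estimate, and neither ever reaches completion,
which is the point above: the spinning moves, it does not go away.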
Shouldn't we directly go for the non precopy-able vs precopy-able report
that you suggested?
Thanks.