On Thu, Jul 24, 2025 at 10:45:34AM -0400, Jonah Palmer wrote:
>
> On 7/23/25 2:51 AM, Michael S. Tsirkin wrote:
> > On Tue, Jul 22, 2025 at 12:41:25PM +0000, Jonah Palmer wrote:
> > > Lays out the initial groundwork for iteratively migrating the state of a
> > > virtio-net device, starting with its vmstate (via vmstate_save_state &
> > > vmstate_load_state).
> > >
> > > The original non-iterative vmstate framework still runs during the
> > > stop-and-copy phase when the guest is paused, which is still necessary
> > > to migrate the final state of the virtqueues once the source has
> > > been paused.
> > >
> > > Although the vmstate framework is used twice (once during the iterative
> > > portion and once during the stop-and-copy phase), it appears that
> > > there's some modest improvement in guest-visible downtime when using a
> > > virtio-net device.
> > >
> > > When tracing the vmstate_downtime_save and vmstate_downtime_load
> > > tracepoints for a virtio-net device using iterative live migration, the
> > > non-iterative downtime portion improved modestly, going from ~3.2ms to
> > > ~1.4ms:
> > >
> > > Before:
> > > -------
> > > vmstate_downtime_load type=non-iterable idstr=0000:00:03.0/virtio-net
> > > instance_id=0 downtime=3594
> > >
> > > After:
> > > ------
> > > vmstate_downtime_load type=non-iterable idstr=0000:00:03.0/virtio-net
> > > instance_id=0 downtime=1607
> > >
> > > This improvement is likely due to the initial vmstate_load_state call
> > > (while the guest is still running) "warming up" all related pages and
> > > structures on the destination. In other words, by the time the final
> > > stop-and-copy phase starts, the heavy allocations and page-fault
> > > latencies are reduced, making the device re-load slightly faster and
> > > the guest-visible downtime window slightly smaller.
> >
> > Did I get it right that it's just the vmstate load for this single device?
> > If the theory is right, is it not possible that while the tracepoints are
> > now closer together, you have pushed something else out of the cache,
> > making the effect on guest-visible downtime unpredictable? How about the
> > total vmstate load time?
>
> Correct, the data above is just the virtio-net device's downtime
> contribution (specifically during the stop-and-copy phase).
>
> Theoretically, yes, I believe so. To try and get a feel for this, I ran some
> slightly heavier testing for the virtio-net device: vhost-net + 4 queue
> pairs (the one above was just a virtio-net device with 1 queue pair).
>
> I traced the reported downtimes of the devices that come right before and
> after virtio-net's vmstate_load_state call, with and without iterative
> migration on the virtio-net device.
>
> The downtimes below are all from the vmstate_load_state calls that happen
> while the source has been stopped:
>
> With iterative migration for virtio-net:
> ----------------------------------------
> vga:            1.50ms  | 1.39ms  | 1.37ms  | 1.50ms  | 1.63ms  |
> virtio-console: 13.78ms | 14.24ms | 13.74ms | 13.89ms | 13.60ms |
> virtio-net:     13.91ms | 13.52ms | 13.09ms | 13.59ms | 13.37ms |
> virtio-scsi:    18.71ms | 13.96ms | 14.05ms | 16.55ms | 14.30ms |
>
> vga:            Avg. 1.47ms  | Var: 0.0109ms² | Std. Dev (σ): 0.104ms
> virtio-console: Avg. 13.85ms | Var: 0.0583ms² | Std. Dev (σ): 0.241ms
> virtio-net:     Avg. 13.49ms | Var: 0.0904ms² | Std. Dev (σ): 0.301ms
> virtio-scsi:    Avg. 15.51ms | Var: 4.3299ms² | Std. Dev (σ): 2.081ms
>
> Without iterative migration for virtio-net:
> -------------------------------------------
> vga:            1.47ms  | 1.28ms  | 1.55ms  | 1.36ms  | 1.22ms  |
> virtio-console: 13.39ms | 13.40ms | 14.37ms | 13.93ms | 13.36ms |
> virtio-net:     18.52ms | 17.77ms | 17.52ms | 15.52ms | 17.32ms |
> virtio-scsi:    13.35ms | 13.94ms | 15.17ms | 16.01ms | 14.08ms |
>
> vga:            Avg. 1.37ms  | Var: 0.0182ms² | Std. Dev (σ): 0.135ms
> virtio-console: Avg. 13.69ms | Var: 0.2007ms² | Std. Dev (σ): 0.448ms
> virtio-net:     Avg. 17.33ms | Var: 1.2305ms² | Std. Dev (σ): 1.109ms
> virtio-scsi:    Avg. 14.51ms | Var: 1.1352ms² | Std. Dev (σ): 1.065ms
>
> The most notable difference here is the standard deviation of virtio-scsi's
> migration downtime, which comes after virtio-net's migration: virtio-scsi's
> σ rises from ~1.07ms to ~2.08ms when virtio-net is iteratively migrated.
>
> However, since I only collected 5 samples per device, the trend is
> indicative but not definitive.
>
> Total vmstate load time per device ≈ the downtimes reported above, unless
> you're referring to the overall downtime across all devices?
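
As a side note, the Avg/Var/σ figures above are consistent with the usual
Bessel-corrected (n-1) sample variance; that is an assumption on my part, but
it matches the quoted values. A minimal standalone check (plain C, not QEMU
code) using two of the sample rows above:

/* Reproduce the Avg/Var/σ figures, assuming (n-1) sample variance. */
#include <math.h>
#include <stdio.h>

static void stats(const char *name, const double *ms, int n)
{
    double mean = 0.0, var = 0.0;

    for (int i = 0; i < n; i++) {
        mean += ms[i];
    }
    mean /= n;

    for (int i = 0; i < n; i++) {
        var += (ms[i] - mean) * (ms[i] - mean);
    }
    var /= (n - 1);   /* Bessel-corrected sample variance */

    printf("%-15s Avg. %.2fms | Var: %.4fms^2 | Std. Dev: %.3fms\n",
           name, mean, var, sqrt(var));
}

int main(void)
{
    /* virtio-scsi downtimes (ms), with iterative migration for virtio-net */
    const double scsi_iter[] = { 18.71, 13.96, 14.05, 16.55, 14.30 };
    /* virtio-net downtimes (ms), without iterative migration */
    const double net_noiter[] = { 18.52, 17.77, 17.52, 15.52, 17.32 };

    stats("virtio-scsi:", scsi_iter, 5);
    stats("virtio-net:", net_noiter, 5);
    return 0;
}

Built with -lm, this prints e.g. "Avg. 15.51ms | Var: 4.3299ms^2 |
Std. Dev: 2.081ms" for the virtio-scsi row, matching the table.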
Indeed. I also wonder: if preheating the cache is such a big gain, why don't
we just do it for all devices? There is nothing special about virtio here:
just call save for all devices, send the state, call load on the destination,
then call reset to discard the state.

> ----------
>
> Having said all this, this RFC is just an initial first step for iterative
> migration of a virtio-net device. This second vmstate_load_state call during
> the stop-and-copy phase isn't optimal. A future version of this series could
> do away with this second call and only send the deltas instead of the entire
> state again.

I see how this could be a win, in theory, if the state is big.

> > > Future patches could improve upon this by skipping the second
> > > vmstate_save/load_state calls (during the stop-and-copy phase) and
> > > instead only sending deltas right before/after the source is stopped.
> > >
> > > Signed-off-by: Jonah Palmer <jonah.pal...@oracle.com>
> > > ---
> > >  hw/net/virtio-net.c            | 37 ++++++++++++++++++++++++++++++++++
> > >  include/hw/virtio/virtio-net.h |  8 ++++++++
> > >  2 files changed, 45 insertions(+)
> > >
> > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > index 19aa5b5936..86a6fe5b91 100644
> > > --- a/hw/net/virtio-net.c
> > > +++ b/hw/net/virtio-net.c
> > > @@ -3808,16 +3808,31 @@ static bool virtio_net_is_active(void *opaque)
> > >
> > >  static int virtio_net_save_setup(QEMUFile *f, void *opaque, Error **errp)
> > >  {
> > > +    VirtIONet *n = opaque;
> > > +
> > > +    qemu_put_be64(f, VNET_MIG_F_INIT_STATE);
> > > +    vmstate_save_state(f, &vmstate_virtio_net, n, NULL);
> > > +    qemu_put_be64(f, VNET_MIG_F_END_DATA);
> > > +
> > >      return 0;
> > >  }
> > >
> > >  static int virtio_net_save_live_iterate(QEMUFile *f, void *opaque)
> > >  {
> > > +    bool new_data = false;
> > > +
> > > +    if (!new_data) {
> > > +        qemu_put_be64(f, VNET_MIG_F_NO_DATA);
> > > +        return 1;
> > > +    }
> > > +
> > > +    qemu_put_be64(f, VNET_MIG_F_END_DATA);
> > >      return 1;
> > >  }
> > >
> > >  static int virtio_net_save_live_complete_precopy(QEMUFile *f, void *opaque)
> > >  {
> > > +    qemu_put_be64(f, VNET_MIG_F_NO_DATA);
> > >      return 0;
> > >  }
> > >
> > > @@ -3833,6 +3848,28 @@ static int virtio_net_load_setup(QEMUFile *f, void *opaque, Error **errp)
> > >
> > >  static int virtio_net_load_state(QEMUFile *f, void *opaque, int version_id)
> > >  {
> > > +    VirtIONet *n = opaque;
> > > +    uint64_t flag;
> > > +
> > > +    flag = qemu_get_be64(f);
> > > +    if (flag == VNET_MIG_F_NO_DATA) {
> > > +        return 0;
> > > +    }
> > > +
> > > +    while (flag != VNET_MIG_F_END_DATA) {
> > > +        switch (flag) {
> > > +        case VNET_MIG_F_INIT_STATE:
> > > +        {
> > > +            vmstate_load_state(f, &vmstate_virtio_net, n, VIRTIO_NET_VM_VERSION);
> > > +            break;
> > > +        }
> > > +        default:
> > > +            qemu_log_mask(LOG_GUEST_ERROR, "%s: Unknown flag 0x%"PRIx64, __func__, flag);
> > > +            return -EINVAL;
> > > +        }
> > > +
> > > +        flag = qemu_get_be64(f);
> > > +    }
> > >      return 0;
> > >  }
> > >
> > > diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
> > > index b9ea9e824e..d6c7619053 100644
> > > --- a/include/hw/virtio/virtio-net.h
> > > +++ b/include/hw/virtio/virtio-net.h
> > > @@ -163,6 +163,14 @@ typedef struct VirtIONetQueue {
> > >      struct VirtIONet *n;
> > >  } VirtIONetQueue;
> > >
> > > +/*
> > > + * Flags to be used as unique delimiters for virtio-net devices in the
> > > + * migration stream.
> > > + */
> > > +#define VNET_MIG_F_INIT_STATE (0xffffffffef200000ULL)
> > > +#define VNET_MIG_F_END_DATA   (0xffffffffef200001ULL)
> > > +#define VNET_MIG_F_NO_DATA    (0xffffffffef200002ULL)
> > > +
> > >  struct VirtIONet {
> > >      VirtIODevice parent_obj;
> > >      uint8_t mac[ETH_ALEN];
> > > --
> > > 2.47.1
> >
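
To make the framing concrete, here is a minimal, self-contained sketch of the
stream layout this patch defines: a big-endian VNET_MIG_F_INIT_STATE marker,
the vmstate payload, then VNET_MIG_F_END_DATA, parsed on the destination the
same way virtio_net_load_state() does. A plain byte buffer stands in for
QEMUFile, put_be64/get_be64 are stand-in helpers, and a fixed blob stands in
for vmstate_virtio_net; this only illustrates the delimiters, it is not QEMU
code.

/*
 * Standalone model of the VNET_MIG_F_* framing from this patch.
 * A byte buffer replaces QEMUFile; a dummy blob replaces the vmstate payload.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define VNET_MIG_F_INIT_STATE (0xffffffffef200000ULL)
#define VNET_MIG_F_END_DATA   (0xffffffffef200001ULL)
#define VNET_MIG_F_NO_DATA    (0xffffffffef200002ULL)

/* qemu_put_be64() analogue: append a big-endian u64, return new offset. */
static size_t put_be64(uint8_t *buf, size_t pos, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        buf[pos++] = (uint8_t)(v >> (i * 8));
    }
    return pos;
}

/* qemu_get_be64() analogue: read a big-endian u64 and advance *pos. */
static uint64_t get_be64(const uint8_t *buf, size_t *pos)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | buf[(*pos)++];
    }
    return v;
}

int main(void)
{
    uint8_t stream[256];
    uint8_t payload[16] = "device state...";   /* stand-in for vmstate */
    size_t w = 0;

    /* Source side, as in virtio_net_save_setup(): flag, payload, end marker. */
    w = put_be64(stream, w, VNET_MIG_F_INIT_STATE);
    memcpy(stream + w, payload, sizeof(payload));   /* vmstate_save_state() */
    w += sizeof(payload);
    w = put_be64(stream, w, VNET_MIG_F_END_DATA);

    /* Destination side, mirroring the loop in virtio_net_load_state(). */
    size_t r = 0;
    uint64_t flag = get_be64(stream, &r);
    if (flag == VNET_MIG_F_NO_DATA) {
        return 0;
    }
    while (flag != VNET_MIG_F_END_DATA) {
        switch (flag) {
        case VNET_MIG_F_INIT_STATE:
            /* vmstate_load_state() would parse the payload here. */
            printf("loaded %zu-byte initial state: %s\n",
                   sizeof(payload), (const char *)(stream + r));
            r += sizeof(payload);
            break;
        default:
            fprintf(stderr, "unknown flag 0x%016llx\n",
                    (unsigned long long)flag);
            return 1;
        }
        flag = get_be64(stream, &r);
    }
    return 0;
}

One nice property of this flag-delimited layout is that later revisions could
add further flag types (for example a delta section, as the cover text
suggests) without changing how existing sections are parsed.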