Hi Daniel,
On 2026-01-27 14:25, Daniel P. Berrangé wrote:
> On Tue, Jan 27, 2026 at 03:03:10PM +0100, Juraj Marcin wrote:
> > From: Juraj Marcin <[email protected]>
> >
> > During migration switchover both the source and the destination machines
> > are paused (compute downtime). During this period network still routes
> > network packets to the source machine, as this is the last place where
> > the recipient MAC address has been seen. Once the destination side
> > starts and sends network announcement, all subsequent frames are routed
> > correctly. However, frames delivered to the source machine are never
> > processed and lost. This causes also a network downtime with roughly the
> > same duration as compute downtime.
> >
> > This can cause problems not only for protocols that cannot handle packet
> > loss, but can also introduce delays in protocols that can handle them.
> >
> > To resolve this, this feature instantiates a network filter for each
> > network backend present during migration setup on both migration sides.
> > On the source side, this filter caches all packets received from the
> > backend during switchover. Once the destination machine starts, all
> > cached packets are sent through the migration channel and the respective
> > filter object on the destination side injects them to the NIC attached
> > to the backend.
>
> If the dest QEMU has started, I presume this means that the ARP
> announcement has been sent ? IOW, the packets being forwarded
> over the migration stream are guaranteed to be delivered "out
> of order" wrt the sender. Should be safe for TCP, but may have
> an impact on other protocols. Though apps should be aware of
> that risk in general, they may not frequently encounter it, and
> it could still cause service disruption

Yes, the packets are forwarded after the ARP announcement from the dest.
They can therefore be delivered out of order, although how often depends
on the traffic rate; in my testing I encountered out-of-order packets
only a couple of times. As is, this feature lets the user choose between
the risk of packet loss and the risk of out-of-order delivery, both of
which can also happen outside of migration.

I could also update it to defer the delivery of new packets on the
destination until the packets from the source side have been processed,
as Michael suggested. That should prevent out-of-order delivery.
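
To make the deferral idea concrete, here is a rough sketch of the
ordering I have in mind. It is Python for brevity, not QEMU code, and
every name in it (DeferredInjector, on_forwarded, on_live,
on_forwarding_done) is hypothetical; it only models the ordering, under
the assumption that the destination learns when the source's cache has
been fully transferred.

```python
from collections import deque


class DeferredInjector:
    """Hypothetical model of the destination-side filter (not QEMU code)."""

    def __init__(self, deliver):
        self.deliver = deliver        # callback handing a packet to the NIC
        self.forwarding_done = False  # True once the source cache is drained
        self.held = deque()           # live packets that arrived too early

    def on_forwarded(self, pkt):
        # Packets cached on the source during switchover are older,
        # so they are injected right away.
        self.deliver(pkt)

    def on_live(self, pkt):
        # New packets from the backend wait until forwarding has finished.
        if self.forwarding_done:
            self.deliver(pkt)
        else:
            self.held.append(pkt)

    def on_forwarding_done(self):
        # Flush held packets in arrival order, then pass through directly.
        self.forwarding_done = True
        while self.held:
            self.deliver(self.held.popleft())


delivered = []
inj = DeferredInjector(delivered.append)
inj.on_live("new-1")        # arrives before the forwarded packets
inj.on_forwarded("old-1")
inj.on_forwarded("old-2")
inj.on_live("new-2")
inj.on_forwarding_done()
inj.on_live("new-3")
print(delivered)
```

The trade-off is that live packets pick up some extra latency while they
are held, and the hold queue would need a bound in a real
implementation, which this sketch leaves out.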
>
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index f925e5541b..d637b22c80 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -520,6 +520,11 @@
> > # each RAM page. Requires a migration URI that supports seeking,
> > # such as a file. (since 9.0)
> > #
> > +# @netpass: Collect packets received by network backends after source
> > +# VM is paused and send them to the destination once it resumes.
> > +# This (almost) completely eliminates packet loss caused by
> > +# switchover. (since 11.0)
>
> Should mention they will be delivered "out of order"
>
> > +#
> > # Features:
> > #
> > # @unstable: Members @x-colo and @x-ignore-shared are experimental.
> > @@ -536,7 +541,7 @@
> > { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
> > 'validate-uuid', 'background-snapshot',
> > 'zero-copy-send', 'postcopy-preempt', 'switchover-ack',
> > - 'dirty-limit', 'mapped-ram'] }
> > + 'dirty-limit', 'mapped-ram', 'netpass'] }
> >
> > ##
> > # @MigrationCapabilityStatus:
> > --
> > 2.52.0
> >
> >
>
> With regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>