On Wed, 28 Jan 2026 14:06:11 +0100
Juraj Marcin <[email protected]> wrote:

> Hi Stefano,
> 
> On 2026-01-27 19:21, Stefano Brivio wrote:
> > [Cc'ing Laurent and David]
> > 
> > On Tue, 27 Jan 2026 15:03:06 +0100
> > Juraj Marcin <[email protected]> wrote:
> >   
> > > During switchover, there is a period during which both the source-
> > > and destination-side VMs are paused. During this period, all network
> > > packets are still routed to the source side, but the source VM will
> > > never process them.
> > > Once the destination resumes, it is not aware of these packets and they
> > > are lost. This can cause packet loss in unreliable protocols and
> > > extended delays due to retransmission in reliable protocols.
> > > 
> > > This series resolves this problem by caching packets received once
> > > the source VM pauses and then passing and injecting them on the
> > > destination side. This feature is implemented in the last patch. The
> > > caching and injecting are implemented using the network filter
> > > interface and should work with any backend with vhost=off, but only
> > > the TAP network backend was explicitly tested.
> > 
> > I haven't had a chance to try this change with passt(1) yet (the
> > backend can be enabled using "-net passt" or by starting it
> > separately).
> > 
> > Given that passt implements migration on its own (in deeper detail in
> > some sense, as TCP connections are preserved if IP addresses match), I
> > wonder if this might affect or break it somehow.
> > 
> > Did you perhaps have some thoughts about that already?  
> 
> I'm aware of passt migrating its state and passt-repair, but I also
> haven't tested it as I couldn't get passt-repair to work.

Oops. Let me know if you're hitting any specific error I could look
into.

I plan to try out your changes anyway, but I might need a couple of
days before I find the time.

> Does it also handle other protocols, or just preserve TCP connections?

Layer-4-wise, we have an internal representation of UDP "flows"
(observed flows of packets for which we preserve the same source port
mapping, with timeouts) and we had a vague idea of migrating those as
well, but it's debatable whether there's any benefit from it.
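
Just to give an idea of what such a "flow" boils down to, conceptually
it's something like this (a sketch with made-up field names, not
passt's actual flow table):

#include <time.h>
#include <netinet/in.h>

/* Illustrative only: roughly what a UDP "flow" entry needs to carry,
 * that is, the endpoints we observed, the source port mapping we keep
 * stable, and a timestamp used to expire the entry on timeout.
 */
struct udp_flow_sketch {
    struct sockaddr_in6 guest_side; /* guest address and source port */
    struct sockaddr_in6 peer_side;  /* remote peer address and port */
    in_port_t host_port;            /* port we keep bound on the host side */
    time_t last_seen;               /* last activity, for timeout expiry */
};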

At Layer 2 and 3, we migrate the IP and MAC addresses we observed from
the guest:

  https://passt.top/passt/tree/migrate.c?id=e3f70c05bad90368a1a89bf31a9015125232b9ae#n31

so that we have ARP and NDP resolution, as well as any NAT mappings,
working right away as needed.

For completeness, this is the TCP context we migrate instead:

  https://passt.top/passt/tree/tcp_conn.h?id=e3f70c05bad90368a1a89bf31a9015125232b9ae#n108
  https://passt.top/passt/tree/tcp_conn.h?id=e3f70c05bad90368a1a89bf31a9015125232b9ae#n154
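
For readers without the tree at hand, the addressing part is tiny;
conceptually it's along these lines (again just a sketch with made-up
names, the real definitions are behind the migrate.c link above):

#include <stdint.h>
#include <netinet/in.h>

/* Sketch of the per-guest addressing state that has to survive
 * migration so that ARP/NDP resolution and NAT mappings keep working
 * on the target right away: the MAC and IP addresses observed from
 * the guest.
 */
struct guest_addrs_sketch {
    uint8_t mac[6];           /* MAC address observed from the guest */
    struct in_addr addr4;     /* observed IPv4 address */
    struct in6_addr addr6;    /* observed IPv6 (global) address */
    struct in6_addr addr6_ll; /* observed IPv6 link-local address */
};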

> The main focus of this feature is protocols that cannot handle packet
> loss on their own, in environments where the IP address is preserved
> (and thus also TCP connections).

Well, strictly speaking, TCP handles packet loss; that's actually the
main reason behind it. I guess the point is to improve throughput and
avoid latency spikes or retransmissions that aren't strictly necessary?

> So, mainly tap/bridge, with the idea that
> other network backends could also benefit from it. However, if it causes
> problems with other backends, I could limit it just to tap.

I couldn't quite figure out yet whether it's beneficial, useless, or
harmful for passt. With passt, what happens without your
implementation is:

1. guest pauses

2. the source instance of passt starts migrating, meaning that sockets
   are frozen one by one, their receiving and sending queues dumped

3. pending queues are sent to the target instance of passt, which opens
   sockets and refills queues as needed

4. target guest resumes and will get any traffic that was received by
   the source instance of passt between 1. and 2.

Right now there's still a Linux kernel issue we observed (see also
https://pad.passt.top/p/TcpRepairTodo, that's line 4 there) which might
cause segments to be received (and acknowledged!) on sockets of the
source instance of passt for a small time period *after* we freeze them
with TCP_REPAIR (that is, TCP_REPAIR doesn't really freeze the queue).

I'm currently working on a proper fix for that. Until then, point 2.
above isn't entirely accurate (but this only happens if you hammer it
with traffic generators; it's not really visible otherwise).
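
For reference, this is roughly what "freezing" a socket and dumping a
queue means here (a trimmed-down sketch without error reporting, not
the actual passt code; it needs CAP_NET_ADMIN):

#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

#ifndef TCP_REPAIR
#define TCP_REPAIR       19
#define TCP_REPAIR_QUEUE 20
#endif
#ifndef TCP_RECV_QUEUE
#define TCP_RECV_QUEUE   1
#define TCP_SEND_QUEUE   2
#endif

/* Put a connected TCP socket into repair mode (the step after which,
 * with the kernel issue above, segments can still be received and
 * acknowledged for a short while), then peek at its receive queue so
 * that the pending data can be transferred to the target instance.
 */
static ssize_t freeze_and_dump_rcvq(int s, void *buf, size_t len)
{
    int on = 1, q = TCP_RECV_QUEUE;

    if (setsockopt(s, IPPROTO_TCP, TCP_REPAIR, &on, sizeof(on)))
        return -1;

    if (setsockopt(s, IPPROTO_TCP, TCP_REPAIR_QUEUE, &q, sizeof(q)))
        return -1;

    /* in repair mode, MSG_PEEK returns queued data without consuming it */
    return recv(s, buf, len, MSG_PEEK | MSG_DONTWAIT);
}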

With your implementation, I guess:

1. guest pauses

2. the source instance of passt starts migrating, meaning that sockets
   are frozen one by one, their receiving and sending queues dumped

2a. any data received by QEMU after 1. will be stored and forwarded to
    the target later. But passt at this point prevents the guest from
    getting any data, so there should be no data involved

3. pending queues are sent to the target instance of passt, which opens
   sockets and refills queues as needed

3a. the target guest gets the data from 2a. As long as there's no data
    (as I'm assuming), there should be no change. If there's data coming
    in at this point, we risk that sequence numbers don't match anymore?
    I'm not sure.

4. target guest resumes and will *also* get any traffic that was received
   by the source instance of passt between 1. and 2.

So if my assumption from 2a. above holds, it should be useless, but
harmless.
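
For what it's worth, this is how I picture the cache-and-inject logic
on the QEMU side (a self-contained sketch of my understanding, not your
actual patch nor QEMU's net filter interface):

#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* While the source VM is paused, frames headed to the guest are queued
 * instead of delivered; the queue then travels with the migration
 * stream and is flushed into the guest on the destination after
 * resume. Names and structure are made up.
 */
struct cached_pkt {
    struct cached_pkt *next;
    size_t len;
    unsigned char data[];
};

static struct cached_pkt *cache_head;
static struct cached_pkt **cache_tail = &cache_head;
static bool vm_paused;

/* called for each frame headed to the guest: queue it while paused */
static bool cache_if_paused(const unsigned char *buf, size_t len)
{
    struct cached_pkt *p;

    if (!vm_paused)
        return false;   /* deliver normally */

    p = malloc(sizeof(*p) + len);
    if (!p)
        return false;   /* out of memory: fall back to normal delivery */

    p->next = NULL;
    p->len = len;
    memcpy(p->data, buf, len);

    *cache_tail = p;
    cache_tail = &p->next;
    return true;        /* swallowed, to be injected on the target */
}

/* called on the destination after resume: inject and free cached frames */
static void replay_cache(void (*inject)(const unsigned char *, size_t))
{
    while (cache_head) {
        struct cached_pkt *p = cache_head;

        cache_head = p->next;
        inject(p->data, p->len);
        free(p);
    }
    cache_tail = &cache_head;
}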

Would your implementation help with the kernel glitch we're currently
observing? I don't think so, because your implementation would only play
a role between passt and QEMU, and we don't have issues there.

Well, it would be good to try things out. Other than that, unless I'm
missing something, your implementation should probably be skipped for
passt for simplicity, and also to avoid negatively affecting downtime.

Note that you can also use passt without "-net passt" (that's actually
quite recent) but with a tap back-end. Migration is only supported with
vhost-user enabled though, and as far as I understand your implementation
is disabled in that case?

-- 
Stefano

