* Markus Armbruster ([email protected]) wrote:
> TL;DR: I recommend staying away from migration when using chardev=...
>
> ivshmem migration is messed up in several entertaining ways.
>
> = General lossage =
>
> G1. Migrating more than one peer doesn't work, but that's a (badly)
>     documented restriction, not a bug (see documentation of property
>     "role" in qemu-doc.texi).  If you migrate more than one, the shared
>     memory can get messed up.
>
> G2. If peers connect on the destination before migration is complete,
>     the shared memory can get messed up.  This isn't even badly
>     documented.
>
> Management applications can deal with this in principle.
>
> = Lossage with MSI-X (msi=on) =
>
> M1. s->intrstatus and s->intrmask (registers INTRSTATUS and INTRMASK)
>     are not migrated, even though they have guest-visible contents.
>     They reset to zero instead.  Wrong, but unlikely to cause trouble,
>     because the registers are inert in this configuration.
>
> There's nothing management applications can do about this.
>
> = Lossage with interrupts (chardev=...) =
>
> I1. s->vm_id (register IVPOSITION) is not migrated.  It briefly changes
>     to -1, then to whatever ID the server on the destination assigns.
>     To get the same ID back, you must carefully control the order in
>     which devices connect to the server on the destination: if this
>     device was the n-th to connect on the source, it must also be the
>     n-th on the destination.
>
>     We can hope that the guest reads IVPOSITION rarely or not at all
>     after device driver initialization, so the temporary change to -1
>     will be overlooked most of the time.
>
> I2. If the shared memory's ramblock arrives at the destination before
>     shared memory setup completes, migration fails.  Shared memory setup
>     completes shortly after the shared memory is received from the
>     server.
>
> I3. If migration completes before the shared memory setup completes on
>     the source, shared memory contents are lost (zeroed?).  I don't yet
>     know what happens when shared memory setup completes during
>     migration.
>
> G2 + I1 implies that you can only migrate the peer with ID zero.
> Management applications need to make sure the device with role=master
> connects first both on source and destination, which seems feasible.
>
> There's nothing management applications can do about the temporary
> IVPOSITION change (I1).
>
> There is no known way for a management application to wait for shared
> memory setup to complete.
>
> Migration failure due to I2 is recoverable: restart the server on the
> destination, and retry the migration with a bit more time between
> running the destination QEMU and the migrate command.  The server
> restart is necessary to preserve ID zero.
>
> I'm not aware of a way to guard against or mitigate I3.  Fortunately,
> shared memory setup should almost always win the race.
>
> = What can we do about it? =
>
> G1 and G2 are a matter of improving documentation.
>
> M1 is easy enough to fix, if we care.
>
> That leaves I1, I2 and I3.  Common root cause: we don't finish setup in
> realize(), we merely arrange for messages from the server to be received
> and processed.  This exposes both guest and migration to an incompletely
> set up device.
>
> Completing setup right in realize() would be simpler and race-free.
> However, it could also make realize() hang waiting for a hung server.
> Probably okay for -device, but what about hot plug?
>
> If it's not okay, we could split ivshmem into a frontend and a backend.
> Hot plug could create the backend asynchronously, wait for it to
> complete, then create the frontend / device model.  Command line would
> have to create the backend synchronously, of course.

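As an aside on I1 and the "G2 + I1" remark quoted above, here is a toy
model of the ordering constraint.  It assumes the server simply hands
out IDs 0, 1, 2, ... in connect order, which is what the quoted
description implies.  ToyServer and assigned_ids are made-up names for
illustration only; this is not the real ivshmem server or its protocol.

```python
# Toy model only: NOT the real ivshmem server or its wire protocol.
# Assumption: IDs are handed out sequentially in connect order.


class ToyServer:
    """Hypothetical server that assigns IDs 0, 1, 2, ... per connect."""

    def __init__(self):
        self.next_id = 0
        self.ids = {}

    def connect(self, peer):
        # The n-th peer to connect gets ID n-1.
        self.ids[peer] = self.next_id
        self.next_id += 1
        return self.ids[peer]


def assigned_ids(connect_order):
    """Return {peer: id} for a fresh server and the given connect order."""
    server = ToyServer()
    for peer in connect_order:
        server.connect(peer)
    return server.ids


# Source host: the role=master device connects first.
source = assigned_ids(["master", "peer-a", "peer-b"])
# Destination, same connect order: the master keeps its ID.
same_order = assigned_ids(["master", "peer-a", "peer-b"])
# Destination, a peer sneaks in first: the master's ID changes.
other_order = assigned_ids(["peer-a", "master", "peer-b"])

print(source["master"], same_order["master"], other_order["master"])
```

In this model the migrated device sees the same IVPOSITION only when it
is the n-th to connect on both sides, which is why connecting the
role=master device first on a freshly started server preserves ID zero.
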
How can you tell when 'shared memory setup' is complete?  You could
delay starting incoming migration on the destination, or starting a
migration on the source, until that setup is complete.

Dave

>
> Other ideas?
>
--
Dr. David Alan Gilbert / [email protected] / Manchester, UK
