On Wed, Sep 13, 2023 at 06:53:20PM -0300, Fabiano Rosas wrote:
> Peter Xu <[email protected]> writes:
>
> > On Mon, Sep 11, 2023 at 02:13:19PM -0300, Fabiano Rosas wrote:
> >> The core yank code is strict about balanced registering and
> >> unregistering of yank functions.
> >>
> >> This creates a difficulty because the migration code registers one
> >> yank function per QIOChannel, but each QIOChannel can be referenced by
> >> more than one QEMUFile. The yank function should not be removed until
> >> all QEMUFiles have been closed.
> >>
> >> Keep a reference count of how many QEMUFiles are using a QIOChannel
> >> that has a yank function. Only unregister the yank function when all
> >> QEMUFiles have been closed.
> >>
> >> This improves the current code by removing the need for the programmer
> >> to know which QEMUFile is the last one to be cleaned up and fixes the
> >> theoretical issue of removing the yank function while another QEMUFile
> >> could still be using the ioc and require a yank.
> >>
> >> Signed-off-by: Fabiano Rosas <[email protected]>
> >> ---
> >> migration/yank_functions.c | 81 ++++++++++++++++++++++++++++++++++----
> >> migration/yank_functions.h | 8 ++++
> >> 2 files changed, 81 insertions(+), 8 deletions(-)
> >
> > I worry this over-complicate things.
>
> It does. We ran out of simple options.
>
> > If you prefer the cleaness that we operate always on qemufile level, can we
> > just register each yank function per-qemufile?
>
> "just" hehe
>
> we could, but:
>
> i) the yank is a per-channel operation, so this is even more unintuitive;
I mean we can provide something like:
void migration_yank_qemufile(void *opaque)
{
QEMUFile *file = opaque;
QIOChannel *ioc = file->ioc;
qio_channel_shutdown(ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
}
void migration_qemufile_register_yank(QEMUFile *file)
{
if (migration_ioc_yank_supported(file->ioc)) {
yank_register_function(MIGRATION_YANK_INSTANCE,
migration_yank_qemufile,
file);
}
}
>
> ii) multifd doesn't have a QEMUFile, so it will have to continue using
> the ioc;
We can keep using migration_ioc_[un]register_yank() for them if there's no
qemufile attached. As long as the function will all be registered under
MIGRATION_YANK_INSTANCE we should be fine having different yank func.
>
> iii) we'll have to add a yank to every new QEMUFile created during the
> incoming migration (colo, rdma, etc), otherwise the incoming side
> will be left using iocs while the src uses the QEMUFile;
For RDMA, IIUC it'll simply be a noop as migration_ioc_yank_supported()
will be a noop for it for either reg/unreg.
Currently it seems we will also unreg the ioc even for RDMA (even though we
don't reg for it). But since unreg will be a noop it seems all fine even
if not paired.. maybe we should still try to pair it, e.g. register also in
rdma_start_outgoing_migration() for the rdma ioc so at least they're paired.
I don't see why COLO is special here, though. Maybe I missed something.
>
> iv) this is a functional change of the yank feature for which we have no
> tests.
Having yank tested should be preferrable. Lukas is in the loop, let's see
whether he has something. We can still smoke test it before a selftest
being there.
Taking one step back.. I doubt whether anyone is using yank for migration?
Knowing that migration already have migrate-cancel (for precopy) and
migrate-pause (for postcopy). I never used it myself, and I don't think
it's supported for RHEL. How's that in suse's case?
If no one is using it, maybe we can even avoid registering migration to
yank?
>
> If that's all ok to you I'll go ahead and git it a try.
>
> > I think qmp yank will simply fail the 2nd call on the qemufile if the
> > iochannel is shared with the other one, but that's totally fine, IMHO.
> >
> > What do you think?
> >
> > In all cases, we should probably at least merge patch 1-8 if that can
> > resolve the CI issue. I think all of them are properly reviewed.
>
> I agree. Someone needs to queue this though since Juan has been busy.
Yes, I'll see what I can do.
--
Peter Xu