On 2.07.2025 08:27, Cédric Le Goater wrote:
Adding more maintainers,

+Eric (ARM smmu),
+Peter (ARM, GIC, virt),

On 6/24/25 19:51, Maciej S. Szmigiero wrote:
From: "Maciej S. Szmigiero" <maciej.szmigi...@oracle.com>

This property allows configuring whether to start the config load only
after all iterables were loaded.
Such interlocking is required for ARM64 due to this platform's VFIO
dependency on the interrupt controller being loaded first.

Could you please explain a bit more?

Any proposals what more you'd want to have written there?

Do you want to have the description of the issue being fixed from commit
d329f5032e17 copied into this fix commit message?

The property defaults to AUTO, which means ON for ARM, OFF for other
platforms.
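
For illustration only, the AUTO default described above could resolve to a
concrete value roughly as follows. This is a minimal sketch, not the code from
the patch: the TriState enum, resolve_config_after_iter() and the
target_is_arm parameter are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state mirroring an on/off/auto property. */
typedef enum {
    TRISTATE_AUTO,
    TRISTATE_ON,
    TRISTATE_OFF,
} TriState;

/*
 * Resolve the property to a concrete boolean.
 * target_is_arm is a stand-in for however the caller identifies an ARM
 * target; it is an assumption of this sketch, not a real QEMU helper.
 */
static bool resolve_config_after_iter(TriState prop, bool target_is_arm)
{
    switch (prop) {
    case TRISTATE_ON:
        return true;
    case TRISTATE_OFF:
        return false;
    case TRISTATE_AUTO:
        /* AUTO means ON for ARM, OFF for every other platform. */
        return target_is_arm;
    }
    assert(0 && "invalid tri-state value");
    return false;
}
```

An explicit ON or OFF always wins; the target is consulted only for AUTO.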

Signed-off-by: Maciej S. Szmigiero <maciej.szmigi...@oracle.com>

As we've mentioned a couple of times, this is essentially a workaround
to help ARM support a migration optimization (multifd) for guests using
passthrough PCI devices. At the moment, this is mainly for MLX5 VFs
(upstream) and NVIDIA vGPUs (not upstream).

It looks like the issue is related to the ordering of the vmstate during
load time.

Is there a different way we could address this ? Other virt machines like
x86 and ppc also deal with complex interrupt controllers, and those cases
have been handled cleanly AFAICT. So what’s the fundamental issue here that
makes it necessary to add more complexity to an already complex feature
(multifd VFIO migration) for what seems to be a corner case on a single
architecture ?

d329f5032e17 is the turning point.
That commit says that restoring VFIO devices config space on an ARM target
needs to have the VGIC interrupt controller there already fully loaded,
otherwise that VFIO config space load operation may error out.
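
To make the ordering requirement concrete, here is a rough sketch of the kind
of interlock involved: config state that arrives early is held back until all
iterable state has been loaded. The LoadState structure and both function names
are hypothetical and exist only for this example; the real patch is structured
differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device load state, for illustration only. */
typedef struct {
    bool config_after_iter;  /* resolved property value */
    bool iterables_loaded;   /* set once all iterable state has been loaded */
    bool config_pending;     /* config data arrived but its load was deferred */
    bool config_loaded;      /* config state has been applied */
} LoadState;

/* Called when the device config data is available on the target. */
static void on_config_data_ready(LoadState *s)
{
    assert(!s->config_loaded);
    if (s->config_after_iter && !s->iterables_loaded) {
        /* Interlock active: defer until the iterables are done. */
        s->config_pending = true;
        return;
    }
    s->config_loaded = true;
}

/* Called once all iterable state has been loaded. */
static void on_iterables_loaded(LoadState *s)
{
    s->iterables_loaded = true;
    if (s->config_pending) {
        /* Apply the config load that was deferred earlier. */
        s->config_pending = false;
        s->config_loaded = true;
    }
}
```

With the interlock disabled, the config load proceeds as soon as its data is
ready, which is the behaviour on platforms without this dependency.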

But here it would be good to have some feedback from ARM people, thanks for
CCing them.

I'm open to reasonable alternative proposals - as long as they can be tested,
preferably on other platforms.

The reason I implemented this ARM device loading ordering requirement this way
is that I can then test it on the VFIO setup I have and be reasonably certain
that it will work on the target platform too.

Using 'strcmp(target_name(), "aarch64")' is quite unique in the QEMU code
base, and to be honest, I’m not too keen on adding it unless there’s really
no other option.
The previous versions simply tested the TARGET_ARM macro but commit
5731baee6c3c ("hw/vfio: Compile some common objects once") made the
migration-multifd.c file target-independent so it cannot use target-specific
macros now.

Another option would be to move vfio_load_config_after_iter() to helpers.c
since that file is target-dependent and can simply test TARGET_ARM
macro (#if defined(TARGET_ARM)) instead of doing
strcmp(target_name(), "aarch64"), which I agree looks weird.
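
For comparison, the two approaches look roughly like this. This is a
standalone sketch: is_arm_target_runtime() and is_arm_target_compile_time()
are hypothetical names, and the runtime variant takes the target name as a
parameter instead of calling QEMU's target_name().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Runtime variant: usable from target-independent code.
 * 'name' stands in for the string that target_name() would return.
 */
static bool is_arm_target_runtime(const char *name)
{
    assert(name != NULL);
    return strcmp(name, "aarch64") == 0;
}

/*
 * Compile-time variant: only meaningful in a file that is built once per
 * target, where the TARGET_ARM macro is visible.
 */
static bool is_arm_target_compile_time(void)
{
#if defined(TARGET_ARM)
    return true;
#else
    return false;
#endif
}
```

The compile-time check avoids the string comparison but ties the function to a
per-target compilation unit, which is the trade-off being discussed here.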

Thanks,

C.



Thanks,
Maciej

