On Tue, 11 Jul 2017 15:32:18 +1000 Alexey Kardashevskiy <a...@ozlabs.ru> wrote:
> At the moment VFIO PCI device initialization works as follows: > vfio_realize > vfio_get_group > vfio_connect_container > register memory listeners (1) > update QEMU groups lists > vfio_kvm_device_add_group > > Then (example for pseries) the machine reset hook triggers region_add() > for all regions where listeners from (1) are listening: > > ppc_spapr_reset > spapr_phb_reset > spapr_tce_table_enable > memory_region_add_subregion > vfio_listener_region_add > vfio_spapr_create_window > > This scheme works fine until we need to handle VFIO PCI device hotplug > and we want to enable PPC64/sPAPR in-kernel TCE acceleration on, > i.e. after PCI hotplug we need a place to call > ioctl(vfio_kvm_device_fd, KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE). > Since the ioctl needs a LIOBN fd (from sPAPRTCETable) and a IOMMU group fd > (from VFIOGroup), vfio_listener_region_add() seems to be the only place > for this ioctl(). > > However this only works during boot time because the machine reset > happens strictly after all devices are finalized. When hotplug happens, > vfio_listener_region_add() is called when a memory listener is registered > but when this happens: > 1. new group is not added to the container->group_list yet; > 2. VFIO KVM device is unaware of the new IOMMU group. > > This moves bits around to have all necessary VFIO infrastructure > in place for both initial startup and hotplug cases. Wow, for a fairly straight forward patch this changelog is brutal. Can this be summarized as "Register vfio groups with kvm prior to memory listener registration such that kvm-vfio pseudo device ioctls are available during the region_add callback"? If so I'll queue this up for a pull request prior to soft freeze. Thanks, Alex > Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> > Reviewed-by: David Gibson <da...@gibson.dropbear.id.au> > --- > > > This is an independend part of a bigger patchset to enable in-kernel > acceleration of TCE operations. While I stuck with IOMMU MR QOM-fication, > let's try to parallel patchset pieces' review/acceptance. > > --- > Changes: > v3: > * rebased on current upstream > > v2: > * moved container->initialized back to its correct location > * added missing QLIST_REMOVE() > --- > hw/vfio/common.c | 19 +++++++++++-------- > 1 file changed, 11 insertions(+), 8 deletions(-) > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c > index 2965b68e5d..4e75dc8c56 100644 > --- a/hw/vfio/common.c > +++ b/hw/vfio/common.c > @@ -1107,6 +1107,14 @@ static int vfio_connect_container(VFIOGroup *group, > AddressSpace *as, > goto free_container_exit; > } > > + vfio_kvm_device_add_group(group); > + > + QLIST_INIT(&container->group_list); > + QLIST_INSERT_HEAD(&space->containers, container, next); > + > + group->container = container; > + QLIST_INSERT_HEAD(&container->group_list, group, container_next); > + > container->listener = vfio_memory_listener; > > memory_listener_register(&container->listener, container->space->as); > @@ -1120,14 +1128,11 @@ static int vfio_connect_container(VFIOGroup *group, > AddressSpace *as, > > container->initialized = true; > > - QLIST_INIT(&container->group_list); > - QLIST_INSERT_HEAD(&space->containers, container, next); > - > - group->container = container; > - QLIST_INSERT_HEAD(&container->group_list, group, container_next); > - > return 0; > listener_release_exit: > + QLIST_REMOVE(group, container_next); > + QLIST_REMOVE(container, next); > + vfio_kvm_device_del_group(group); > vfio_listener_release(container); > > free_container_exit: > @@ -1232,8 +1237,6 @@ VFIOGroup *vfio_get_group(int groupid, AddressSpace > *as, Error **errp) > > QLIST_INSERT_HEAD(&vfio_group_list, group, next); > > - vfio_kvm_device_add_group(group); > - > return group; > > close_fd_exit: