> -----Original Message-----
> From: [email protected] [mailto:iommu-
> [email protected]] On Behalf Of Don Dutile
> Sent: Thursday, April 25, 2013 1:11 AM
> To: Alex Williamson
> Cc: Yoder Stuart-B08248; [email protected]
> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
>
> On 04/23/2013 03:47 PM, Alex Williamson wrote:
> > On Tue, 2013-04-23 at 19:16 +0000, Yoder Stuart-B08248 wrote:
> >>
> >>> -----Original Message-----
> >>> From: Alex Williamson [mailto:[email protected]]
> >>> Sent: Tuesday, April 23, 2013 11:56 AM
> >>> To: Yoder Stuart-B08248
> >>> Cc: Joerg Roedel; [email protected]
> >>> Subject: Re: RFC: vfio / iommu driver for hardware with no iommu
> >>>
> >>> On Tue, 2013-04-23 at 16:13 +0000, Yoder Stuart-B08248 wrote:
> >>>> Joerg/Alex,
> >>>>
> >>>> We have embedded systems where we use QEMU/KVM and have the
> >>>> requirement to do device assignment, but have no iommu. So we
> >>>> would like to get vfio-pci working on systems like this.
> >>>>
> >>>> We're aware of the obvious limitations-- no protection, DMA'able
> >>>> memory must be physically contiguous and will have no iova->phy
> >>>> translation. But there are use cases where all OSes involved are
> >>>> trusted and customers can live with those limitations.
> >>>> Virtualization is used here not to sandbox untrusted code, but to
> >>>> consolidate multiple OSes.
> >>>>
> >>>> We would like to get your feedback on the rough idea. There are
> >>>> two parts-- iommu driver and vfio-pci.
> >>>>
> >>>> 1. iommu driver
> >>>>
> >>>> First, we still need device groups created because vfio is based on
> >>>> that, so we envision a 'dummy' iommu driver that implements only
> >>>> the add/remove device ops.
> >>>> Something like:
> >>>>
> >>>> static struct iommu_ops fsl_none_ops = {
> >>>>         .add_device = fsl_none_add_device,
> >>>>         .remove_device = fsl_none_remove_device,
> >>>> };
> >>>>
> >>>> int fsl_iommu_none_init()
> >>>> {
> >>>>         int ret = 0;
> >>>>
> >>>>         ret = iommu_init_mempool();
> >>>>         if (ret)
> >>>>                 return ret;
> >>>>
> >>>>         bus_set_iommu(&platform_bus_type,&fsl_none_ops);
> >>>>         bus_set_iommu(&pci_bus_type,&fsl_none_ops);
> >>>>
> >>>>         return ret;
> >>>> }
> >>>>
> >>>> 2. vfio-pci
> >>>>
> >>>> For vfio-pci, we would ideally like to keep user space mostly
> >>>> unchanged. User space will have to follow the semantics of mapping
> >>>> only physically contiguous chunks...and iova will equal phys.
> >>>>
> >>>> So, we propose to implement a new vfio iommu type, called
> >>>> VFIO_TYPE_NONE_IOMMU. This implements any needed vfio interfaces,
> >>>> but there are no calls to the iommu layer...e.g. map_dma() is a
> >>>> noop.
> >>>>
> >>>> Would like your feedback.
> >>>
> >>> My first thought is that this really detracts from vfio and iommu
> >>> groups being a secure interface, so somehow this needs to be clearly
> >>> an insecure mode that requires an opt-in and maybe taints the
> >>> kernel. Any notion of unprivileged use needs to be blocked and it
> >>> should test CAP_COMPROMISE_KERNEL (or whatever it's called now) at
> >>> critical access points. We might even have interfaces exported that
> >>> would allow this to be an out-of-tree driver (worth a check).
> >>>
> >>> I would guess that you would probably want to do all the iommu group
> >>> setup from the vfio fake-iommu driver. In other words, that driver
> >>> both creates the fake groups and provides the dummy iommu backend for
> >>> vfio.
> >>> That would be a nice way to compartmentalize this as a
> >>> vfio-noiommu-special.
> >>
> >> So you mean don't implement any of the iommu driver ops at all and
> >> keep everything in the vfio layer?
> >>
> >> Would you still have real iommu groups?...i.e.
> >> $ readlink /sys/bus/pci/devices/0000:06:0d.0/iommu_group
> >> ../../../../kernel/iommu_groups/26
> >>
> >> ...and that is created by vfio-noiommu-special?
> >
> > I'm suggesting (but haven't checked if it's possible), to implement
> > the iommu driver ops as part of the vfio iommu backend driver. The
> > primary motivation for this would be to a) keep a fake iommu groups
> > interface out of the iommu proper (possibly containing it in an
> > external driver) and b) modularizing it so we don't have fake iommu
> > groups being created by default. It would have to populate the iommu
> > groups sysfs interfaces to be compatible with vfio.
> >
> >> Right now when the PCI and platform buses are probed, the iommu
> >> driver add-device callback gets called and that is where the
> >> per-device group gets created. Are you envisioning registering a
> >> callback for the PCI bus to do this in vfio-noiommu-special?
> >
> > Yes. It's just as easy to walk all the devices rather than doing
> > callbacks, iirc the group code does this when you register. In fact,
> > this noiommu interface may not want to add all devices, we may want to
> > be very selective and only add some.
> >
> Right.
> Sounds like a no-iommu driver is needed to leave vfio unaffected, and
> still leverage/use vfio for qemu's device assignment.
> Just not sure how to 'taint' it as 'not secure' if no-iommu driver put in
> place.
>
> btw -- qemu has the inherent assumption that pci cfg cycles are trapped,
> so assigned devices are 'remapped' from system-B:D.F to virt-machine's
> (virtualized) B:D.F of the assigned device.
> Are pci-cfg cycles trapped in freescale qemu model ?
>

The vfio-pci device would be visible (to a KVM guest) as a PCI device on
the virtual PCI bus (emulated by qemu).

-Varun
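
For illustration only, the vfio-pci semantics proposed in the quoted thread (iova must equal phys, buffers must be physically contiguous, map_dma() performs no translation, and the mode requires an explicit opt-in) could be modelled roughly as below. This is a userspace sketch, not kernel code and not the proposed driver. Every name in it (noiommu_backend, noiommu_map_req, noiommu_map_dma, the NOIOMMU_* status codes) is hypothetical, and the "contiguous" flag simply stands in for whatever check a real implementation would perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes; not part of any kernel or VFIO API. */
enum noiommu_status {
        NOIOMMU_OK = 0,
        NOIOMMU_ERR_NOT_OPTED_IN,   /* insecure mode was not explicitly enabled */
        NOIOMMU_ERR_BAD_RANGE,      /* zero length or address wrap-around */
        NOIOMMU_ERR_NOT_IDENTITY,   /* iova != phys, which needs a real iommu */
        NOIOMMU_ERR_NOT_CONTIGUOUS, /* buffer is not physically contiguous */
};

/* Hypothetical description of one DMA mapping request from user space. */
struct noiommu_map_req {
        uint64_t iova;    /* address the device will use for DMA */
        uint64_t phys;    /* physical address backing the buffer */
        uint64_t size;    /* length in bytes */
        bool contiguous;  /* assumption: caller knows the buffer is contiguous */
};

/* Hypothetical backend state for the "no iommu" vfio type. */
struct noiommu_backend {
        bool unsafe_opt_in;  /* models the opt-in asked for in the review */
        size_t nr_mappings;  /* bookkeeping only; nothing is programmed */
};

/*
 * Model of a map_dma() that is a no-op at the iommu layer: it only
 * validates that the request fits the "iova equals phys, physically
 * contiguous" contract and then records the mapping.
 */
static enum noiommu_status
noiommu_map_dma(struct noiommu_backend *be, const struct noiommu_map_req *req)
{
        assert(be != NULL && req != NULL);

        if (!be->unsafe_opt_in)
                return NOIOMMU_ERR_NOT_OPTED_IN;
        if (req->size == 0 || req->iova > UINT64_MAX - req->size)
                return NOIOMMU_ERR_BAD_RANGE;
        if (req->iova != req->phys)
                return NOIOMMU_ERR_NOT_IDENTITY;
        if (!req->contiguous)
                return NOIOMMU_ERR_NOT_CONTIGUOUS;

        /* No translation is set up; the device uses the address as-is. */
        be->nr_mappings++;
        return NOIOMMU_OK;
}
```

In this model a request is refused until the backend has been opted in, and any request whose iova differs from phys is rejected rather than silently accepted, since without an iommu there is nothing that could translate it.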
