On Wed, Nov 05, 2025 at 10:18:01AM -0800, Nicolin Chen wrote:
> On Wed, Nov 05, 2025 at 06:57:31AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <[email protected]>
> > > Sent: Tuesday, November 4, 2025 2:54 AM
> > >
> > > On Thu, Oct 30, 2025 at 12:43:59PM -0700, Nicolin Chen wrote:
> > >
> > > > FWIW, I am thinking of another design based on Jason's remarks:
> > > > https://lore.kernel.org/linux-iommu/aQBopHFub8wyQh5C@Asurada-
> > > Nvidia/
> > > >
> > > > So, instead of core initiating the round trip between the blocking
> > > > domain and group->domain, it forwards dev_reset_prepare/done to the
> > > > driver where it does a low-level attachment that wouldn't fail:
> > > > For SMMUv3, it's an STE update.
> > > > For intel_iommu, it seems to be the context table update?
> > >
> > > Kevin, how bad do you think the UAPI issue is if we ignore it?
> > >
> >
> > yeah probably better to leave it. I didn't see a clean way and the
> > value didn't justify the complexity.
> >
> > Regarding to PF reset, it's a devastating operation while the vf user
> > is operating the vf w/o any awareness. there must be certain
> > coordination in userspace. otherwise nobody can recover the
> > registers. Comparing to that, solving the domain attach problem
> > is less important...
>
> If I capture these correctly, we should go with a -EBUSY version:
> - Reject concurrent attachments during a device reset
> - Skip reset for devices having sibling group devices
> - Allow PF to stop IOMMU, ignoring VFs
> ?
>
> That sounds pretty much like this v4:
> https://lore.kernel.org/linux-iommu/0f6021b500c74db33af8118210dd7a2b2fd31b3c.1756682135.git.nicol...@nvidia.com/
> by dropping the SRIOV concern.

It seems like the simplest answer.. I'd ignore the VFs, I think it is
already really weird/dangerous to be resetting the PF while VFs have
drivers bound..

Not sure there is anything we can do to make this work better.

Jason

