Ilpo Järvinen <[email protected]> writes:

> Hi all,
>
> Thanks to issue reports from Simon Richter and Alex Bennée, I
> discovered BAR resize rollback can corrupt the resource tree. As fixing
> corruption requires avoiding overlapping resource assignments, the
> correct fix can unfortunately result in a worse user experience: what
> appeared to be "working" previously might no longer do so. Thus, I had
> to do a larger rework to pci_resize_resource() in order to properly
> restore resource states as they were prior to BAR resize.
>
> This rework has been on my TODO list anyway but it wasn't the highest
> prio item until pci_resize_resource() started to cause regressions due
> to other resource assignment algorithm changes.

Thanks, I'll have a look. Where does this apply? At least v6.17 doesn't
seem to have pbus_reassign_bridge_resources, which 4/11 is trying to
tweak.

>
> BAR resize rollback does not always restore BAR resources as they were
> before the resize operation was started. Currently, when
> pci_resize_resource() call is made by a driver, the driver must release
> device resources prior to the call. This is a design flaw in
> pci_resize_resource() API as PCI core cannot then save the state of
> those resources from what it was prior to release so it could restore
> them later if the BAR size change has to be rolled back.
>
> PCI core's BAR resize operation doesn't even attempt to restore the
> device resources currently when rolling back BAR resize operation. If
> the normal resource assignment algorithm assigned those resources, then
> device resources might be assigned after pci_resize_resource() call but
> that could also trigger the resource tree corruption issue so what
> appeared to a user as "working" might be a corrupted state.
>
> With the new pci_resize_resource() interface, the driver calling
> pci_resize_resource() should no longer release the device resources.
>
> I've added WARN_ON_ONCE() to pick up similar bugs that cause resource
> tree corruption. At least in my tests all looked clear on that front
> after this series.
>
> It would still be nice if the reporters could test that these changes
> resolve the claim conflicts (while I've tested the series to some extent,
> I don't have such conflicts here).
>
> This series will likely conflict with some drm changes from Lucas (will
> make them partially obsolete by removing the need to release dev's
> resources on the driver side).
>
> I'll soon submit a refresh of the pci/rebar series on top of this series
> as there are some conflicts with them.
>
> v2:
> - Add exclude_bars parameter to pci_resize_resource()
> - Add Link tags
> - Add kerneldoc patch
> - Add patch to release pci_bus_sem earlier.
> - Fix to uninitialized var warnings.
> - Don't use guard() as goto from before it triggers error with clang.
>
> Ilpo Järvinen (11):
>   PCI: Prevent resource tree corruption when BAR resize fails
>   PCI/IOV: Adjust ->barsz[] when changing BAR size
>   PCI: Change pci_dev variable from 'bridge' to 'dev'
>   PCI: Try BAR resize even when no window was released
>   PCI: Freeing saved list does not require holding pci_bus_sem
>   PCI: Fix restoring BARs on BAR resize rollback path
>   PCI: Add kerneldoc for pci_resize_resource()
>   drm/xe: Remove driver side BAR release before resize
>   drm/i915: Remove driver side BAR release before resize
>   drm/amdgpu: Remove driver side BAR release before resize
>   PCI: Prevent restoring assigned resources
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |  10 +-
>  drivers/gpu/drm/i915/gt/intel_region_lmem.c |  14 +--
>  drivers/gpu/drm/xe/xe_vram.c                |   5 +-
>  drivers/pci/iov.c                           |  15 +--
>  drivers/pci/pci-sysfs.c                     |  17 +--
>  drivers/pci/pci.c                           |   4 +
>  drivers/pci/pci.h                           |   9 +-
>  drivers/pci/setup-bus.c                     | 126 ++++++++++++++------
>  drivers/pci/setup-res.c                     |  52 ++++----
>  include/linux/pci.h                         |   3 +-
>  10 files changed, 142 insertions(+), 113 deletions(-)
>
>
> base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
