On Tue, Feb 24, 2026 at 12:09:37AM +0800, Shawn Lin wrote:
> On Mon, 2026/02/23 at 23:50, Andy Shevchenko wrote:
> > On Mon, Feb 23, 2026 at 5:32 PM Shawn Lin <[email protected]> wrote:
> > >
> > > This patch series addresses a long-standing design issue in the PCI/MSI
> > > subsystem where the implicit, automatic management of IRQ vectors by
> > > the devres framework conflicts with explicit driver cleanup, creating
> > > ambiguity and potential resource management bugs.
> > >
> > > ==== The Problem: Implicit vs. Explicit Management ====
> > > Historically, `pcim_enable_device()` not only manages standard PCI
> > > resources (BARs) via devres but also implicitly triggers automatic IRQ
> > > vector management by setting a flag that registers `pcim_msi_release()`
> > > as a cleanup action.
> > >
> > > This creates an ambiguous ownership model. Many drivers follow a pattern
> > > of:
> > > 1. Calling `pci_alloc_irq_vectors()` to allocate interrupts.
> > > 2. Also calling `pci_free_irq_vectors()` in their error paths or remove
> > >    routines.
> > >
> > > When such a driver also uses `pcim_enable_device()`, the devres framework
> > > may attempt to free the IRQ vectors a second time upon device release,
> > > leading to a double-free. Analysis of the tree shows this hazardous
> > > pattern exists widely, while 35 other drivers correctly rely solely on
> > > the implicit cleanup.
> >
> > Is this confirmed? From what I read in the cover letter, this series was
> > only compile-tested, so how can you prove the problem exists in the
> > first place?
>
> Yes, it's confirmed. My debugging of a double-free issue in an out-of-tree
> PCIe wifi driver which uses
> pcim_enable_device + pci_alloc_irq_vectors + pci_free_irq_vectors exposed
> it. And we did have a TODO to clean up this hybrid usage, targeted in
> this cycle[1], suggested by Philipp:

Okay, fair enough. I think this bit was missing in the cover letter.

> [1] https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=msi

> > > ==== The Solution: Making Management Explicit ====
> > > This series enforces a clear, predictable model:
> > > 1. New Managed API (Patch 1/37): Introduces pcim_alloc_irq_vectors() and
> > >    pcim_alloc_irq_vectors_affinity(). Drivers that desire devres-managed
> > >    IRQ vectors should use these functions, which set the is_msi_managed
> > >    flag and ensure automatic cleanup.
> > > 2. Patches 2 through 36 convert each driver that uses
> > >    pcim_enable_device() alongside pci_alloc_irq_vectors() and relies on
> > >    devres for IRQ vector cleanup to instead make an explicit call to
> > >    pcim_alloc_irq_vectors().
> > > 3. Core Change (Patch 37/37): With the former cleanup done, this patch
> > >    modifies pcim_setup_msi_release() to check only the is_msi_managed
> > >    flag. This decouples automatic IRQ cleanup from pcim_enable_device().
> > >    IRQ vectors allocated via pci_alloc_irq_vectors*() are now solely the
> > >    driver's responsibility to free with pci_free_irq_vectors().
> > >
> > > With these changes, we get a clear ownership model: explicit resource
> > > management eliminates ambiguity and follows the "principle of least
> > > surprise." New drivers should choose one model and be consistent.
> > > - Use `pci_alloc_irq_vectors()` + `pci_free_irq_vectors()` for explicit
> > >   control.
> > > - Use `pcim_alloc_irq_vectors()` for devres-managed, automatic cleanup.
> >
> > Have you checked previous attempts? Why is your series better than those?
>
> There seem to be no previous attempts.

Maybe we are looking at different projects...

https://lore.kernel.org/all/?q=pcim_alloc_irq_vectors

> > > ==== Testing And Review ====
> > > 1. This series has only been compile-tested with allmodconfig.
> > > 2. Given the substantial size of this patch series, I have structured
> > >    the mailing to facilitate efficient review. The cover letter, the
> > >    first patch and the last one will be sent to all relevant mailing
> > >    lists and key maintainers to ensure broad visibility and initial
> > >    feedback on the overall approach. The remaining subsystem-specific
> > >    patches will be sent only to the respective subsystem maintainers
> > >    and their associated mailing lists, reducing noise.

-- 
With Best Regards,
Andy Shevchenko
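
For readers following the thread, the ownership rules described in the quoted cover letter can be sketched as a small userspace C model. This is not kernel code and is not taken from the series: `struct model_dev`, `enum model_policy`, and every `model_*` function are hypothetical names made up for illustration. The model only counts how often the vectors get released, and it assumes the pre-series behaviour is roughly "allocating vectors on a devres-enabled device implicitly registers a release action", as the cover letter describes it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these types are the kernel's. */
struct model_dev {
	bool devres_enabled;	/* stands in for pcim_enable_device() */
	bool is_msi_managed;	/* stands in for the flag named in the cover letter */
	int vectors;		/* vectors currently allocated */
	int free_calls;		/* how many times a release was attempted */
};

enum model_policy {
	MODEL_IMPLICIT,		/* assumed behaviour before the series */
	MODEL_EXPLICIT,		/* assumed behaviour after patch 37/37 */
};

/* Stands in for pci_free_irq_vectors(): just counts the release attempt. */
static void model_free_vectors(struct model_dev *d)
{
	d->free_calls++;
	d->vectors = 0;
}

/* Stands in for pci_alloc_irq_vectors(). */
static void model_alloc(struct model_dev *d, int n, enum model_policy p)
{
	assert(n > 0);
	d->vectors = n;
	/* Implicit policy: a devres-enabled device gets cleanup registered. */
	if (p == MODEL_IMPLICIT && d->devres_enabled)
		d->is_msi_managed = true;
}

/* Stands in for pcim_alloc_irq_vectors(): always registers cleanup. */
static void model_alloc_managed(struct model_dev *d, int n)
{
	assert(n > 0);
	d->vectors = n;
	d->is_msi_managed = true;
}

/* Stands in for devres running its actions when the device is released. */
static void model_device_release(struct model_dev *d)
{
	if (d->is_msi_managed)
		model_free_vectors(d);
}

/* Runs one driver lifecycle and returns the number of release attempts. */
static int model_lifecycle(enum model_policy p, bool managed_alloc,
			   bool driver_frees)
{
	struct model_dev d = { .devres_enabled = true };

	if (managed_alloc)
		model_alloc_managed(&d, 4);
	else
		model_alloc(&d, 4, p);

	if (driver_frees)
		model_free_vectors(&d);	/* explicit free in remove() */

	model_device_release(&d);
	return d.free_calls;
}
```

Under these assumptions, `model_lifecycle(MODEL_IMPLICIT, false, true)` counts two release attempts (the hybrid pattern from the thread), while the same driver under `MODEL_EXPLICIT` counts one. A driver that relied on the implicit cleanup counts zero releases under `MODEL_EXPLICIT` unless it switches to the managed allocator, which is the role the cover letter assigns to patches 2 through 36.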
