Hi Niklas,
I just checked on my local git clone of noble master-next:
$ git log --oneline --grep "s390/pci: Allow re-add of a reserved but not yet removed device"
1c9b1caef9db s390/pci: Allow re-add of a reserved but not yet removed device
$ git tag --contains 1c9b1caef9db
Ubuntu-6.8.0-64.67 <=====
Ubuntu-lowlatency-6.8.0-64.67.1
fheimes@T570:~/ubuntu-noble-master-next/noble-clean$ git show 1c9b1caef9db | grep -C 3 zpci_event_reappear
zdev->state = ZPCI_FN_STATE_STANDBY;
}
+static void zpci_event_reappear(struct zpci_dev *zdev)
+{
+ lockdep_assert_held(&zdev->state_lock);
+ /*
--
}
} else {
+ if (zdev->state == ZPCI_FN_STATE_RESERVED)
+ zpci_event_reappear(zdev);
/* the configuration request may be stale */
- if (zdev->state != ZPCI_FN_STATE_STANDBY)
+ else if (zdev->state != ZPCI_FN_STATE_STANDBY)
--
}
} else {
+ if (zdev->state == ZPCI_FN_STATE_RESERVED)
+ zpci_event_reappear(zdev);
zpci_update_fh(zdev, ccdf->fh);
}
break;
fheimes@T570:~/ubuntu-noble-master-next/noble-clean$ rmadison --suite=noble,noble-proposed --arch=s390x linux-generic
linux-generic | 6.8.0-31.31 | noble | s390x
linux-generic | 6.8.0-64.67 | noble-proposed | s390x <=====
So the fix is included in 6.8.0-64.67, which is currently in noble-proposed.
In the git web interface, I think this is the matching log search:
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble/log/?h=master-next&qt=grep&q=s390%2Fpci%3A+Allow+re-add+of+a+reserved+but+not+yet+removed+device
which points to this commit:
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble/commit/?h=master-next&id=1c9b1caef9db383b587a19df8c6dcd14eb5f3af2
(That is, look at the regular master-next branch:
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble/log/?h=master-next
instead of:
master-next--2025.06.16-2--auto)
--
https://bugs.launchpad.net/bugs/2114174
Title:
[UBUNTU 24.04] s390/pci: Fix immediate re-add of PCI function after
remove
Status in Ubuntu on IBM z Systems:
Fix Committed
Status in linux package in Ubuntu:
Invalid
Status in linux source package in Noble:
Fix Committed
Status in linux source package in Plucky:
Fix Committed
Bug description:
[ Impact ]
s390/pci: Fix immediate re-add of PCI function after remove
A PCI function may be reserved directly after being
deconfigured. If it subsequently returns in the standby
state, Linux may not be able to use the new instance and
emits a kernel warning about trying to create an already
existing sysfs file for the IOMMU.
The problem occurs because the new instance of the same
underlying device is created before the prior instance is
completely torn down. This happens because the lifetime of the
PCI device representation in Linux is determined by reference
counts. A driver, the network stack, or even user-space
(including via vfio-pci) may be holding onto the device
representation even after the underlying device is gone.
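The reference-count lifetime described above can be illustrated with a
small userspace sketch. This is only a model under stated assumptions,
not the kernel code: struct model_dev, its fields, and all model_*
functions are hypothetical names, and the sysfs_entry flag merely stands
in for the per-device sysfs files that stay around until the old
instance is released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model; NOT the kernel's struct zpci_dev. */
struct model_dev {
	int refcount;     /* holders: driver, network stack, user space */
	bool hw_present;  /* false once the underlying function is gone */
	bool sysfs_entry; /* stand-in for the device's sysfs files */
};

/* Create a device representation holding one initial reference. */
static struct model_dev *model_dev_create(void)
{
	struct model_dev *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->refcount = 1;
	d->hw_present = true;
	d->sysfs_entry = true;
	return d;
}

/* Take an additional reference, e.g. on behalf of a driver. */
static void model_dev_get(struct model_dev *d)
{
	d->refcount++;
}

/* Drop a reference; returns true if the object was freed. */
static bool model_dev_put(struct model_dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount > 0)
		return false;
	free(d);
	return true;
}

/*
 * The hardware goes away: only the initial reference is dropped.
 * The object, and with it the modeled sysfs entry, survives while
 * anyone else still holds a reference.
 */
static bool model_hw_remove(struct model_dev *d)
{
	d->hw_present = false;
	return model_dev_put(d);
}
```

In this model, calling model_hw_remove() while another holder exists
returns false and leaves sysfs_entry set, which is the window in which a
freshly created second instance would collide with the old one.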
The solution to this is twofold. First, allow reusing the
pre-existing struct zpci_dev and/or struct pci_dev for the newly
re-added instance of the underlying device up until the point
where the struct zpci_dev is fully removed. Second, serialize
the addition and removal of PCI functions such that re-adding
a new instance, after the old one is already being removed, will
wait for the removal to finish before adding the new instance.
This fix also builds on prior upstream work of serializing state
transitions for PCI devices, e.g. from configured to standby.
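The two parts of the solution can be sketched in userspace as follows.
This is a simplified model under stated assumptions and not the actual
patch: the model_* names, the fixed-size table, and the single
pthread mutex are hypothetical stand-ins for the kernel's device list
and locking.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states, loosely mirroring standby/configured/reserved. */
enum model_state { MODEL_STANDBY, MODEL_CONFIGURED, MODEL_RESERVED };

struct model_zdev {
	unsigned int fid;
	enum model_state state;
	bool in_use; /* object not yet fully removed */
};

#define MODEL_MAX_DEVS 4
static struct model_zdev model_devs[MODEL_MAX_DEVS];
/* One lock serializes every add and remove in this model. */
static pthread_mutex_t model_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold model_lock. */
static struct model_zdev *model_find(unsigned int fid)
{
	for (size_t i = 0; i < MODEL_MAX_DEVS; i++)
		if (model_devs[i].in_use && model_devs[i].fid == fid)
			return &model_devs[i];
	return NULL;
}

/*
 * Availability event: reuse a reserved but not yet removed object,
 * otherwise create a new one. Returns NULL if the table is full.
 */
static struct model_zdev *model_add(unsigned int fid, bool *reused)
{
	struct model_zdev *z;

	assert(reused != NULL);
	pthread_mutex_lock(&model_lock);
	*reused = false;
	z = model_find(fid);
	if (z) {
		if (z->state == MODEL_RESERVED)
			z->state = MODEL_STANDBY; /* the device "reappears" */
		*reused = true;
	} else {
		for (size_t i = 0; i < MODEL_MAX_DEVS && !z; i++) {
			if (!model_devs[i].in_use) {
				z = &model_devs[i];
				z->fid = fid;
				z->state = MODEL_STANDBY;
				z->in_use = true;
			}
		}
	}
	pthread_mutex_unlock(&model_lock);
	return z;
}

/* Firmware reserves the function; the object itself stays around. */
static void model_reserve(unsigned int fid)
{
	struct model_zdev *z;

	pthread_mutex_lock(&model_lock);
	z = model_find(fid);
	if (z)
		z->state = MODEL_RESERVED;
	pthread_mutex_unlock(&model_lock);
}

/* Final removal; a concurrent model_add() waits on model_lock. */
static void model_release(unsigned int fid)
{
	struct model_zdev *z;

	pthread_mutex_lock(&model_lock);
	z = model_find(fid);
	if (z)
		z->in_use = false;
	pthread_mutex_unlock(&model_lock);
}
```

Because model_add() and model_release() take the same lock, an add that
races with a removal either finds the old object and reuses it, or runs
after the removal has finished and creates a fresh one. It never creates
a second object while the first still exists.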
[ Fix ]
Backport from mainline:
- 0d48566d4b58 s390/pci: rename lock member in struct zpci_dev
- bcb5d6c76903 s390/pci: introduce lock to synchronize state of zpci_dev's
- 6ee600bfbe0f s390/pci: remove hotplug slot when releasing the device
- c4a585e952ca s390/pci: Fix potential double remove of hotplug slot
- 42420c50c68f s390/pci: Fix missing check for zpci_create_device() error return
- 05a2538f2b48 s390/pci: Fix duplicate pci_dev_put() in disable_slot() when PF has child VFs
- d76f96332967 s390/pci: Remove redundant bus removal and disable from zpci_release_device()
- 47c397844869 s390/pci: Prevent self deletion in disable_slot()
- 4b1815a52d7e s390/pci: Allow re-add of a reserved but not yet removed device
- 774a1fa880bc s390/pci: Serialize device addition and removal
[ Test Plan ]
The issue can be reproduced by observing how the kernel handles NETH
PCI functions. IBM Z firmware temporarily reserves NETH PCI functions
to check for pending service when the last FID of a PCHID is
deconfigured. When nothing is pending, the PCI function is immediately
returned in the standby state, which triggers this issue quite
reliably.
[ Where Problems Could Occur ]
The fix affects the PCI function lifecycle management in the s390 PCI
hotplug infrastructure, specifically the serialization and reuse logic
of zpci_dev and pci_dev structures during rapid remove and re-add
cycles. A defect in this fix could cause stale or incorrectly reused
device state, leading to improper reinitialization of PCI functions.
---
Description: s390/pci: Fix immediate re-add of PCI function after remove
Symptom: A PCI function may be reserved directly after being
deconfigured. If it subsequently returns in the standby
state, Linux may not be able to use the new instance and
emits a kernel warning about trying to create an already
existing sysfs file for the IOMMU.
Problem: The problem occurs because the new instance of the same
underlying device is created before the prior instance is
completely torn down. This happens because the lifetime of the
PCI device representation in Linux is determined by reference
counts. A driver, the network stack, or even user-space
(including via vfio-pci) may be holding onto the device
representation even after the underlying device is gone.
Solution: The solution to this is twofold. First, allow reusing the
pre-existing struct zpci_dev and/or struct pci_dev for the newly
re-added instance of the underlying device up until the point
where the struct zpci_dev is fully removed. Second, serialize
the addition and removal of PCI functions such that re-adding
a new instance, after the old one is already being removed, will
wait for the removal to finish before adding the new instance.
This fix also builds on prior upstream work of serializing state
transitions for PCI devices, e.g. from configured to standby.
Reproduction: This problem was originally found with firmware which
temporarily reserves NETH PCI functions to check for pending
service when the last FID of a PCHID is deconfigured. When
nothing is pending the PCI function is immediately returned in
the standby state, thus triggering this issue quite reliably.
Upstream-ID: 0d48566d4b58946c8e1b0baac0347616060a81c9
bcb5d6c769039c8358a2359e7c3ea5d97ce93108
6ee600bfbe0f818ffb7748d99e9b0c89d0d9f02a
c4a585e952ca403a370586d3f16e8331a7564901
42420c50c68f3e95e90de2479464f420602229fc
05a2538f2b48500cf4e8a0a0ce76623cc5bafcf1
d76f9633296785343d45f85199f4138cb724b6d2
47c397844869ad0e6738afb5879c7492f4691122
4b1815a52d7eb03b3e0e6742c6728bc16a4b2d1d
774a1fa880bc949d88b5ddec9494a13be733dfa8