One problem with this approach is that the early-quirks code in x86
performs a recursive search of the PCI hierarchy, starting from the
"first" bus 0000:00 and walking through all secondary buses by jumping
between bridges. For historical perspective on this code's evolution,
see [0].

This is not enough on multi-processor systems, which may have multiple
PCIe root complexes exposing many root ports, and therefore multiple
hierarchy domains. The PCIe spec does not even guarantee that those
hierarchies can communicate with each other; from PCIe spec 3.0,
section 1.3.1: "[...] The capability to route peer-to-peer transactions
between hierarchy domains through a Root Complex is optional and
implementation dependent. For example, an implementation may
incorporate a real or virtual Switch internally within the Root Complex
to enable full peer-to-peer support in a software transparent way."

Usually we don't see PCI devices that are unable to communicate with
each other just because they sit under different host bridges (aka root
complexes in PCIe terminology). But from a software perspective, what
Linux sees are multiple PCI devices organized as a tree. The naive
recursion from check_dev_quirk() in arch/x86 always starts from bus
0000:00, so it can't reach all root complexes.

To illustrate what this tree looks like with a single root bridge and
with multiple root bridges, we'll attach the output of "lspci -t" for
two systems next.
That said, we needed to change the bus scanning process to be
comprehensive and walk through all buses. Good references on how the
BIOS probes multi-root-complex PCIe systems (including the bus
numbering rationale) are [1] and [2].
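
As a rough illustration of the limitation described above, the sketch
below models a toy topology with two root buses (0x00 and 0x80) and
compares a recursive walk that starts at bus 0 with a walk that tries
every bus number. It is not the kernel code: struct toy_dev, the topo
table, and all function names are made up for this example, and the
real early-quirks code reads PCI config space instead of a static table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_BUS 256

/* Hypothetical toy model: one entry per PCI device. */
struct toy_dev {
    int bus;        /* bus number the device sits on */
    int secondary;  /* secondary bus if the device is a bridge, else -1 */
};

/* Assumed topology: two root buses, each with one bridge and one leaf. */
static const struct toy_dev topo[] = {
    { 0x00, -1 }, { 0x00, 0x01 }, { 0x01, -1 },
    { 0x80, -1 }, { 0x80, 0x81 }, { 0x81, -1 },
};
#define NDEV (sizeof(topo) / sizeof(topo[0]))

static bool visited[NDEV];

/* Forget which devices were already seen, so a new scan can start. */
void reset_scan(void)
{
    memset(visited, 0, sizeof(visited));
}

/*
 * Visit every not-yet-seen device on 'bus' and descend through bridges.
 * Returns the number of devices visited by this call.
 */
int scan_bus_recursive(int bus)
{
    int count = 0;

    assert(bus >= 0 && bus < MAX_BUS);
    for (size_t i = 0; i < NDEV; i++) {
        if (topo[i].bus != bus || visited[i])
            continue;
        visited[i] = true;
        count++;
        if (topo[i].secondary >= 0)
            count += scan_bus_recursive(topo[i].secondary);
    }
    return count;
}

/* Try every bus number instead of relying on reachability from bus 0. */
int scan_all_buses(void)
{
    int count = 0;

    for (int bus = 0; bus < MAX_BUS; bus++)
        count += scan_bus_recursive(bus);
    return count;
}
```

With this table, scan_bus_recursive(0) after reset_scan() visits only
the three devices below bus 0x00, while scan_all_buses() visits all six.
The visited[] array keeps the second approach from counting a device
twice when its bus was already reached through a bridge.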


[0] The early PCI scan dates back to the BitKeeper era; it was added by
Andi Kleen's "[PATCH] APIC fixes for x86-64" in October 2003. It
initially restricted the search to the first 32 buses and slots. Due to
a potential bug found in Nvidia chipsets, the scan was changed to run
only on the first root bus: see commit 8659c406ade3 ("x86: only scan
the root bus in early PCI quirks").
Finally, scanning of secondary buses reachable from the first bus was
re-added by commit 850c321027c2 ("x86/quirks: Reintroduce scanning of
secondary buses").

[1] https://codywu2010.wordpress.com/2015/11/29/how-modern-multi-processor-multi-root-complex-system-assigns-pci-bus-number/

[2] PCI Firmware Specification and the ACPI spec.

-- 
https://bugs.launchpad.net/bugs/1797990

Title:
  kdump fail due to an IRQ storm

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Trusty:
  Confirmed
Status in linux source package in Xenial:
  Confirmed
Status in linux source package in Bionic:
  Confirmed
Status in linux source package in Cosmic:
  Confirmed

Bug description:
  We have reports of a kdump failure in Ubuntu (on x86 machines) that
  was narrowed down to an MSI IRQ storm coming from a PCI network device.

  The bug manifests as a lack of progress in the boot process of the
  kdump kernel, and a storm of kernel messages like:

  [...]
  [  342.265294] do_IRQ: 0.155 No irq handler for vector
  [  342.266916] do_IRQ: 0.155 No irq handler for vector
  [  347.258422] do_IRQ: 14053260 callbacks suppressed
  [...]

  The root cause of the issue is that the kexec process that boots the
  kdump kernel does not ensure PCI devices are reset and/or their MSI
  capabilities are disabled, so a PCI device can generate a huge number
  of interrupts that consume all the CPU's processing time (especially
  since we restrict the kdump kernel to a single CPU).
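
  To make "MSI capabilities are disabled" concrete, here is a minimal
  sketch that walks a device's capability list and clears the MSI
  Enable bit. It works on a 256-byte in-memory copy of config space
  rather than real hardware, and cfg_read16(), cfg_write16(),
  find_capability() and disable_msi() are hypothetical helpers written
  for this example, not kernel functions. The offsets and bit values
  are the standard ones from the PCI specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard PCI config-space offsets and bits. */
#define CFG_STATUS          0x06
#define CFG_STATUS_CAP_LIST 0x10    /* device implements a capability list */
#define CFG_CAP_PTR         0x34    /* offset of the first capability */
#define CAP_ID_MSI          0x05
#define MSI_FLAGS_OFFSET    2       /* Message Control, relative to the capability */
#define MSI_FLAGS_ENABLE    0x0001

/* Hypothetical accessors for a little-endian copy of config space. */
static uint16_t cfg_read16(const uint8_t *cfg, size_t off)
{
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static void cfg_write16(uint8_t *cfg, size_t off, uint16_t val)
{
    cfg[off] = (uint8_t)(val & 0xff);
    cfg[off + 1] = (uint8_t)(val >> 8);
}

/* Return the offset of capability 'id', or 0 if the device lacks it. */
size_t find_capability(const uint8_t *cfg, uint8_t id)
{
    assert(cfg != NULL);
    if (!(cfg_read16(cfg, CFG_STATUS) & CFG_STATUS_CAP_LIST))
        return 0;

    size_t pos = cfg[CFG_CAP_PTR] & 0xfc;
    /* The hop limit guards against a malformed, looping list. */
    for (int ttl = 48; pos >= 0x40 && ttl > 0; ttl--) {
        if (cfg[pos] == id)
            return pos;
        pos = cfg[pos + 1] & 0xfc;  /* next-capability pointer */
    }
    return 0;
}

/* Clear MSI Enable. Returns 1 if MSI was on and is now off, else 0. */
int disable_msi(uint8_t *cfg)
{
    size_t pos = find_capability(cfg, CAP_ID_MSI);
    if (pos == 0)
        return 0;

    uint16_t flags = cfg_read16(cfg, pos + MSI_FLAGS_OFFSET);
    if (!(flags & MSI_FLAGS_ENABLE))
        return 0;

    cfg_write16(cfg, pos + MSI_FLAGS_OFFSET,
                (uint16_t)(flags & ~MSI_FLAGS_ENABLE));
    return 1;
}
```

  An early-boot fix would have to do the equivalent through real config
  space accesses, and before interrupts are enabled in the kdump kernel.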

  This was tested using upstream kernel version 4.18, and the problem
  reproduces.
  In the specific test scenario, the PCI NIC was an "Intel 82599ES
  10-Gigabit [8086:10fb]" that was used in SR-IOV PCI passthrough mode
  (vfio_pci), under high load on the guest.

