Patches submitted to kernel team list: https://lists.ubuntu.com/archives/kernel-team/2025-May/159553.html
----

SRU Justification:

[Impact]

The pci=config_acs kernel parameter is used to enable and configure PCI Access Control Services (ACS) between PCIe devices. In particular, this parameter can enable and/or restrict peer-to-peer traffic between the PCIe devices it configures. For example, this parameter is necessary for GPUDirect RDMA applications, where peer-to-peer communication between a GPU and an RDMA-capable device is required. This parameter allows an administrator to configure the system for the specific level of isolation between PCIe devices required to enable this feature for their use case.

[Fix]

For Oracular, this consists of a clean cherry pick from mainline of commit 9cf8a952d57b ("PCI/ACS: Fix 'pci=config_acs=' parameter") to fix the functionality of the config_acs parameter introduced by commit 47c8846a49ba ("PCI: Extend ACS configurability").

For Noble, this consists of clean cherry picks of commits 47c8846a49ba ("PCI: Extend ACS configurability") and 9cf8a952d57b ("PCI/ACS: Fix 'pci=config_acs=' parameter").

[Test Plan]

The Noble and Oracular patchsets were tested on a DGX GH200 system by booting with the kernel parameter test cases described in the commit message of 9cf8a952d57b ("PCI/ACS: Fix 'pci=config_acs=' parameter"). Multiple PCIe devices could be configured with the pci=config_acs parameter, as expected with the fix commit, and pci=disable_acs_redir works as expected.

[Where problems could occur]

This affects the pci=config_acs and pci=disable_acs_redir kernel boot parameters. Issues could arise as malfunctioning of these two boot parameters, or as improper configuration of PCIe devices.

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2100340

Title:
  Backport pci=config_acs parameter with fix commit

Status in linux package in Ubuntu:
  Invalid
Status in linux-nvidia package in Ubuntu:
  Invalid
Status in linux source package in Noble:
  In Progress
Status in linux-nvidia source package in Noble:
  Fix Released
Status in linux source package in Oracular:
  In Progress
Status in linux source package in Plucky:
  Fix Committed

Bug description:
  Upstream Linux kernel commit 47c8846a49ba ("PCI: Extend ACS configurability") introduced bugs that cause ACS ctrl not to be configured to the value specified by the kernel parameter. Essentially there are two bugs:

  1) When ACS is configured for multiple PCI devices using the 'config_acs' kernel parameter, the result is the error "PCI: Can't parse ACS command line parameter". This is due to a bug that does not preserve the ACS mask, but instead overwrites the mask with the value 0.

  For example, using 'config_acs' to configure ACS ctrl for multiple BDFs fails:

    Kernel command line: pci=config_acs=1111011@0020:02:00.0;101xxxx@0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
    PCI: Can't parse ACS command line parameter
    pci 0020:02:00.0: ACS mask = 0x007f
    pci 0020:02:00.0: ACS flags = 0x007b
    pci 0020:02:00.0: Configured ACS to 0x007b

  After this fix:

    Kernel command line: pci=config_acs=1111011@0020:02:00.0;101xxxx@0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
    pci 0020:02:00.0: ACS mask = 0x007f
    pci 0020:02:00.0: ACS flags = 0x007b
    pci 0020:02:00.0: ACS control = 0x005f
    pci 0020:02:00.0: ACS fw_ctrl = 0x0053
    pci 0020:02:00.0: Configured ACS to 0x007b
    pci 0039:00:00.0: ACS mask = 0x0070
    pci 0039:00:00.0: ACS flags = 0x0050
    pci 0039:00:00.0: ACS control = 0x001d
    pci 0039:00:00.0: ACS fw_ctrl = 0x0000
    pci 0039:00:00.0: Configured ACS to 0x0050

  2) In the bit manipulation logic, we copy the bit from the firmware settings when the mask bit is 0.
  For example, 'disable_acs_redir' fails to clear all three ACS P2P redir bits due to the wrong bit fiddling:

    Kernel command line: pci=disable_acs_redir=0020:02:00.0;0030:02:00.0;0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
    pci 0020:02:00.0: ACS mask = 0x002c
    pci 0020:02:00.0: ACS flags = 0xffd3
    pci 0020:02:00.0: Configured ACS to 0xfffb
    pci 0030:02:00.0: ACS mask = 0x002c
    pci 0030:02:00.0: ACS flags = 0xffd3
    pci 0030:02:00.0: Configured ACS to 0xffdf
    pci 0039:00:00.0: ACS mask = 0x002c
    pci 0039:00:00.0: ACS flags = 0xffd3
    pci 0039:00:00.0: Configured ACS to 0xffd3

  After this fix:

    Kernel command line: pci=disable_acs_redir=0020:02:00.0;0030:02:00.0;0039:00:00.0 "dyndbg=file drivers/pci/pci.c +p"
    pci 0020:02:00.0: ACS mask = 0x002c
    pci 0020:02:00.0: ACS flags = 0xffd3
    pci 0020:02:00.0: ACS control = 0x007f
    pci 0020:02:00.0: ACS fw_ctrl = 0x007b
    pci 0020:02:00.0: Configured ACS to 0x0053
    pci 0030:02:00.0: ACS mask = 0x002c
    pci 0030:02:00.0: ACS flags = 0xffd3
    pci 0030:02:00.0: ACS control = 0x005f
    pci 0030:02:00.0: ACS fw_ctrl = 0x005f
    pci 0030:02:00.0: Configured ACS to 0x0053
    pci 0039:00:00.0: ACS mask = 0x002c
    pci 0039:00:00.0: ACS flags = 0xffd3
    pci 0039:00:00.0: ACS control = 0x001d
    pci 0039:00:00.0: ACS fw_ctrl = 0x0000
    pci 0039:00:00.0: Configured ACS to 0x0000
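
The "after this fix" log values in this bug description can be reproduced with a few lines of bit arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and the merge rule is inferred from the logged mask, flags and fw_ctrl values rather than copied from drivers/pci/pci.c. It assumes that each character of a config_acs bit string maps to one ACS control bit (rightmost character is bit 0), that '0' and '1' select a bit for override while 'x' leaves it at the firmware value, and that disable_acs_redir uses mask 0x002c with flags equal to the 16-bit complement of the mask, as the logs show.

```python
# Illustrative sketch; names are hypothetical, not kernel identifiers.

def parse_acs_flags(spec: str) -> tuple[int, int]:
    """Turn a config_acs bit string such as '101xxxx' into (mask, flags).

    The rightmost character is bit 0. '0' and '1' set the mask bit, so the
    value comes from the command line; 'x' leaves the mask bit clear.
    """
    mask = 0
    flags = 0
    for bit, ch in enumerate(reversed(spec)):
        if ch == "1":
            mask |= 1 << bit
            flags |= 1 << bit
        elif ch == "0":
            mask |= 1 << bit
        elif ch != "x":
            raise ValueError(f"unexpected character {ch!r} in {spec!r}")
    return mask, flags


def merge_acs(fw_ctrl: int, mask: int, flags: int) -> int:
    """Keep firmware bits where mask is 0, take flags bits where mask is 1."""
    return ((fw_ctrl & ~mask) | (flags & mask)) & 0xFFFF


# Mask and flags as logged for disable_acs_redir.
ACS_REDIR_MASK = 0x002C
ACS_REDIR_FLAGS = ~ACS_REDIR_MASK & 0xFFFF  # 0xffd3

if __name__ == "__main__":
    # pci=config_acs=1111011@0020:02:00.0;101xxxx@0039:00:00.0
    for spec, fw_ctrl in (("1111011", 0x0053), ("101xxxx", 0x0000)):
        mask, flags = parse_acs_flags(spec)
        result = merge_acs(fw_ctrl, mask, flags)
        print(f"{spec}: mask={mask:#06x} flags={flags:#06x} -> {result:#06x}")

    # pci=disable_acs_redir=0020:02:00.0;0030:02:00.0;0039:00:00.0
    for fw_ctrl in (0x007B, 0x005F, 0x0000):
        result = merge_acs(fw_ctrl, ACS_REDIR_MASK, ACS_REDIR_FLAGS)
        print(f"fw_ctrl={fw_ctrl:#06x} -> {result:#06x}")
```

Under these assumptions the sketch yields 0x007b and 0x0050 for the two config_acs devices, and 0x0053, 0x0053 and 0x0000 for the three disable_acs_redir devices, matching the "Configured ACS to" lines logged after the fix.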