The following evaluation was done in early May ....
In summary, EDR is brand new in upstream kernels and, to my knowledge, has not been exercised in-house at Canonical. It may also be difficult to test, because few systems have EDR functionality in their firmware. I am unsure what the state of regression testing is for this type of PCI functionality.

Some followup patches and discussion after the original EDR submission are below. To be clear, these are not part of the request from Nvidia. They are upstream followups to the EDR patch set, and they indicate that the final state of this functionality is still in flux.

https://lore.kernel.org/linux-pci/1588272369-2145-1-git-send-email-jonathan.derr...@intel.com/

The above is unapplied upstream. It is proposed to fix an issue with the negotiation between the operating system and firmware for control of DPC and, as of 1 May 2020, it is waiting for the following patch to be sorted out:

https://lore.kernel.org/linux-pci/67af2931705bed9a588b5a39d369cb70b9942190.1587925636.git.sathyanarayanan.kuppusw...@linux.intel.com/

This second patch (which is applied upstream to the pci git tree for merging in 5.8) relates to how the operating system and firmware negotiate control of AER and DPC via the "firmware first" bit in the ACPI HEST vs. using the _OSC ACPI method. Prior to this patch, the ACPI HEST takes priority; after this patch, _OSC is the only method consulted.

This second patch could:

(a) affect Nvidia's use of DPC / EDR, depending upon how their firmware negotiates control with the operating system, and

(b) affect our existing installed base platforms that currently negotiate one way but would end up with a different result under this patch, e.g., a platform that advertises "firmware first" in the ACPI HEST, but not via _OSC.

I think it is more likely that server class systems will already do this correctly, and that embedded / "IoT" type devices are more likely to see an impact from this change.
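To make the HEST vs. _OSC change described above concrete, here is a rough sketch of the two ownership rules as I understand them. It is a simplified model and not kernel code: the function names and the two boolean inputs are made up for illustration, and the real negotiation involves more state than this.

```python
from itertools import product


def os_owns_before(hest_firmware_first: bool, osc_grants_os: bool) -> bool:
    """Assumed pre-patch rule: the HEST "firmware first" bit takes priority."""
    if hest_firmware_first:
        return False  # firmware keeps control regardless of _OSC
    return osc_grants_os


def os_owns_after(hest_firmware_first: bool, osc_grants_os: bool) -> bool:
    """Assumed post-patch rule: only the _OSC result is consulted."""
    return osc_grants_os


def changed_platforms():
    """Return the (hest_firmware_first, osc_grants_os) pairs whose outcome changes."""
    return [
        (hest, osc)
        for hest, osc in product([False, True], repeat=2)
        if os_owns_before(hest, osc) != os_owns_after(hest, osc)
    ]


if __name__ == "__main__":
    for hest, osc in changed_platforms():
        print(f"HEST firmware first={hest}, _OSC grants OS control={osc}: "
              f"before OS owns={os_owns_before(hest, osc)}, "
              f"after OS owns={os_owns_after(hest, osc)}")
```

Under this model, the only combination whose outcome changes is a platform that sets "firmware first" in the HEST while _OSC grants control to the operating system, which is the case (b) above.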
--
https://bugs.launchpad.net/bugs/1885030

Title:
  [Intel] Add Error Disconnect Recover support

Status in intel:
  New
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Focal:
  Incomplete
Status in linux source package in Groovy:
  Incomplete

Bug description:
  As per the following PCIe spec ECNs, implement EDR support in Linux kernel and upstream it.

  https://members.pcisig.com/wg/PCI-SIG/document/14076
  https://members.pcisig.com/wg/PCI-SIG/document/12888

  Patchset (Merged 5.7 mainline):

  894020fdd88c PCI/AER: Rationalize error status register clearing
  ac1c8e35a326 PCI/DPC: Add Error Disconnect Recover (EDR) support
  aea47413e7ce PCI/DPC: Expose dpc_process_error(), dpc_reset_link() for use by EDR
  20e15e673b05 PCI/AER: Add pci_aer_raw_clear_status() to unconditionally clear Error Status
  27005618178e PCI/DPC: Cache DPC capabilities in pci_init_capabilities()
  e8e5ff2aeec1 PCI/ERR: Return status of pcie_do_recovery()
  b6cf1a42f916 PCI/ERR: Remove service dependency in pcie_do_recovery()
  be06c1b42eea PCI/DPC: Move DPC data into struct pci_dev
  6d2c89441571 PCI/ERR: Update error status after reset_link()
  b5dfbeacf748 PCI/ERR: Combine pci_channel_io_frozen cases