The following evaluation was done in early May ....

In summary, EDR is brand new in upstream kernels and to my knowledge has
not been exercised in-house at Canonical, and may be difficult to test
due to a lack of systems with EDR functionality in their firmware.  I am
unsure as to what the state of regression testing is with regards to
this type of PCI functionality.

Some followup patches and discussion after the original EDR submission
are below.

To be clear, these are not part of the request from Nvidia.  These are
upstream followups to the EDR patch set, and are indicative of the final
state of this functionality still being in flux.

https://lore.kernel.org/linux-pci/1588272369-2145-1-git-send-email-jonathan.derr...@intel.com/

The above is unapplied upstream; it is proposed to fix an issue with the
negotiation between the operating system and firmware for control for
DPC, and as of 1 May 2020 is pending waiting for

https://lore.kernel.org/linux-pci/67af2931705bed9a588b5a39d369cb70b9942190.1587925636.git.sathyanarayanan.kuppusw...@linux.intel.com/

to be sorted out.

This patch (which is applied upstream to pci git tree for merging in
5.8) relates to how the operating system and firmware negotiate control
of AER and DPC via the "firmware first" bit in the ACPI HEST vs. using
the _OSC ACPI method.  Prior to this patch, the ACPI HEST takes
priority; after this patch, _OSC is the only method consulted.

This second patch could:

(a) affect Nvidia's use of DPC / EDR, depending upon how their firmware
is negotiating the control with the operating system, and

(b) affect our existing installed base platforms that currently end up
negotiating one way, but end up with a different result under this
patch.  E.g., the platform advertises "firmware first" in the ACPI HEST,
but not via _OSC.  I think it is more likely that server class systems
will already do this correctly, and that embedded / "IoT" type devices
are more likely to see an impact from this change.
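
The change in negotiation described above can be illustrated with a
small Python model. This is only a sketch of the decision logic as
summarized in this comment, not kernel code: the Platform class, its
field names, and the three functions are hypothetical, and the
"before" rule assumes that a "firmware first" HEST entry simply
overrides whatever _OSC granted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Hypothetical summary of what a platform's firmware advertises."""
    hest_firmware_first: bool    # "firmware first" bit set in the ACPI HEST
    osc_grants_os_control: bool  # _OSC hands AER/DPC control to the OS


def os_controls_before_patch(p: Platform) -> bool:
    # Assumed pre-patch behaviour: the HEST takes priority, so a
    # "firmware first" HEST keeps the OS out regardless of _OSC.
    if p.hest_firmware_first:
        return False
    return p.osc_grants_os_control


def os_controls_after_patch(p: Platform) -> bool:
    # Post-patch behaviour as described: _OSC is the only method consulted.
    return p.osc_grants_os_control


def result_changes(p: Platform) -> bool:
    """True if the patch flips who ends up owning AER/DPC handling."""
    return os_controls_before_patch(p) != os_controls_after_patch(p)


# Case (b): "firmware first" in the HEST, but _OSC still grants OS control.
inconsistent = Platform(hest_firmware_first=True, osc_grants_os_control=True)
# A platform whose HEST and _OSC agree that firmware keeps control.
consistent = Platform(hest_firmware_first=True, osc_grants_os_control=False)
```

Under these assumptions only the inconsistent platform sees a different
outcome, which matches the concern in (b): firmware whose HEST and _OSC
agree is unaffected.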

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1885030

Title:
  [Intel] Add Error Disconnect Recover support

Status in intel:
  New
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Focal:
  Incomplete
Status in linux source package in Groovy:
  Incomplete

Bug description:
  As per the following PCIe spec ECNs, implement EDR support in Linux
  kernel and upstream it.

  https://members.pcisig.com/wg/PCI-SIG/document/14076

  https://members.pcisig.com/wg/PCI-SIG/document/12888

  Patchset (Merged 5.7 mainline):
  894020fdd88c PCI/AER: Rationalize error status register clearing
  ac1c8e35a326 PCI/DPC: Add Error Disconnect Recover (EDR) support
  aea47413e7ce PCI/DPC: Expose dpc_process_error(), dpc_reset_link() for use by EDR
  20e15e673b05 PCI/AER: Add pci_aer_raw_clear_status() to unconditionally clear Error Status
  27005618178e PCI/DPC: Cache DPC capabilities in pci_init_capabilities()
  e8e5ff2aeec1 PCI/ERR: Return status of pcie_do_recovery()
  b6cf1a42f916 PCI/ERR: Remove service dependency in pcie_do_recovery()
  be06c1b42eea PCI/DPC: Move DPC data into struct pci_dev
  6d2c89441571 PCI/ERR: Update error status after reset_link()
  b5dfbeacf748 PCI/ERR: Combine pci_channel_io_frozen cases

To manage notifications about this bug go to:
https://bugs.launchpad.net/intel/+bug/1885030/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
