Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider: - [High] Bypassing the Linux driver core by manually invoking probe and remove callbacks causes Use-After-Free on failure. - [High] Unbalanced decrement of the `xe->wedged.ref` counter incorrectly permanently wedges the device after a non-fatal error recovery. - [High] Unprotected Read-Modify-Write of Upstream Switch Port (USP) PCIe AER registers causes race conditions and corrupts switch state. - [Medium] Unbounded firmware query loops in `get_queued_pages()` and `get_offlined_list()` allow a malfunctioning sysctrl firmware to hang the kernel. --
commit 1dea0dcd39b5b10ff25f34fd2fa55710387abdc7 Author: Riana Tauro <[email protected]> Subject: Introduce Xe Uncorrectable Error Handling This patch introduces PCI error handling to the Xe driver to process and recover from hardware and firmware errors. It queries the system controller for RAS error details and handles device teardown and re-probe sequences during slot resets. Link: https://patchwork.freedesktop.org/series/160482/ > diff --git a/drivers/gpu/drm/xe/xe_pci_error.c > b/drivers/gpu/drm/xe/xe_pci_error.c > new file mode 100644 > --- /dev/null > +++ b/drivers/gpu/drm/xe/xe_pci_error.c [ ... ] > +static pci_ers_result_t xe_pci_error_detected(struct pci_dev *pdev, > pci_channel_state_t state) > +{ > + struct xe_device *xe = pdev_to_xe_device(pdev); > + > + xe_err(xe, "PCI error: detected state = %u\n", state); > + > + if (state == pci_channel_io_perm_failure) > + return PCI_ERS_RESULT_DISCONNECT; > + > + /* If the device is already wedged or in survivability mode, do not > attempt recovery */ > + if (xe_survivability_mode_is_boot_enabled(xe) || xe_device_wedged(xe)) > + return PCI_ERS_RESULT_DISCONNECT; > + > + switch (state) { > + case pci_channel_io_normal: > + return PCI_ERS_RESULT_CAN_RECOVER; [ ... ] > +static pci_ers_result_t xe_pci_error_slot_reset(struct pci_dev *pdev) > +{ [ ... ] > + /* > + * Secondary Bus Reset causes all VRAM state to be lost along with > + * hardware state. As an initial step, re-probe the device to > + * re-initialize the driver and hardware. > + * TODO: optimize by re-initializing only the hardware state and > re-creating > + * kernel BOs. > + */ > + xe_device_clear_in_reset(xe); > + pdev->driver->remove(pdev); > + devres_release_group(&pdev->dev, xe->devres_group_id); > + > + if (pdev->driver->probe(pdev, ent)) > + return PCI_ERS_RESULT_DISCONNECT; [Severity: High] Can this manual driver teardown cause a use-after-free? If pdev->driver->probe() fails before updating the driver data pointer, pdev->dev.driver_data will remain pointing to the memory freed by devres_release_group(). The PCI core might later call xe_pci_remove(), which will dereference this stale pointer. Would it be safer to avoid manually calling driver lifecycle callbacks? > + > + xe = pdev_to_xe_device(pdev); > + > + /* Wedge the device to prevent I/O operations till the resume callback > */ > + xe_device_wedged_get(xe); > + > + return PCI_ERS_RESULT_RECOVERED; > +} > + > +static void xe_pci_error_resume(struct pci_dev *pdev) > +{ > + struct xe_device *xe = pdev_to_xe_device(pdev); > + > + xe_err(xe, "PCI error: resume\n"); > + > + /* Resume I/O operations */ > + xe_device_wedged_put(xe); [Severity: High] Does this unconditional put cause an underflow when recovering from non-fatal errors? In xe_pci_error_detected(), the pci_channel_io_normal path returns PCI_ERS_RESULT_CAN_RECOVER without calling prepare_device_for_reset(), so the wedged refcount is never incremented. When xe_pci_error_resume() is later called, this put will decrement the counter to -1, which incorrectly evaluates as true in xe_device_wedged() and permanently wedges the device. > +} > diff --git a/drivers/gpu/drm/xe/xe_ras.c b/drivers/gpu/drm/xe/xe_ras.c > new file mode 100644 > --- /dev/null > +++ b/drivers/gpu/drm/xe/xe_ras.c [ ... ] > +static void get_queued_pages(struct xe_device *xe) > +{ > + struct xe_sysctrl_mailbox_command command = {0}; > + struct xe_ras_page_offline_queue response = {0}; > + u32 count = 0; > + size_t rlen; > + int ret, i; [ ... ] > + do { > + memset(&response, 0, sizeof(response)); > + > + ret = xe_sysctrl_send_command(&xe->sc, &command, &rlen); [ ... ] > + } while (response.additional_data); [Severity: Medium] Could a malfunctioning sysctrl firmware cause a soft lockup here? The loop relies entirely on the firmware's additional_data flag and total_pages limit. Unlike xe_ras_process_errors() which enforces an XE_SYSCTRL_FLOOD_LIMIT, this loop appears to lack a driver-enforced iteration ceiling. > +} > + > +static void get_offlined_list(struct xe_device *xe) > +{ > + struct xe_sysctrl_mailbox_command command = {0}; > + struct xe_ras_page_offline_list response = {0}; > + u32 count = 0; > + size_t rlen; > + int ret, i; [ ... ] > + do { > + memset(&response, 0, sizeof(response)); > + > + ret = xe_sysctrl_send_command(&xe->sc, &command, &rlen); [ ... ] > + } while (response.additional_data); [Severity: Medium] Is there a similar risk of a soft lockup in this loop if the firmware continuously returns additional_data? > +} [ ... ] > +static void aer_unmask_and_downgrade_internal_error(struct xe_device *xe) > +{ > + struct pci_dev *pdev = to_pci_dev(xe->drm.dev); > + u32 aer_uncorr_mask, aer_uncorr_sev, aer_uncorr_status; > + struct pci_dev *usp; > + u16 aer_cap; > + > + usp = find_usp_dev(pdev); > + if (!usp) > + return; [ ... ] > + /* Initialize Uncorrectable Error Severity Register */ > + pci_read_config_dword(usp, aer_cap + PCI_ERR_UNCOR_SEVER, > &aer_uncorr_sev); > + aer_uncorr_sev &= ~PCI_ERR_UNC_INTN; > + pci_write_config_dword(usp, aer_cap + PCI_ERR_UNCOR_SEVER, > aer_uncorr_sev); > + > + /* Initialize Uncorrectable Error Mask Register */ > + pci_read_config_dword(usp, aer_cap + PCI_ERR_UNCOR_MASK, > &aer_uncorr_mask); > + aer_uncorr_mask &= ~PCI_ERR_UNC_INTN; > + pci_write_config_dword(usp, aer_cap + PCI_ERR_UNCOR_MASK, > aer_uncorr_mask); > + > + pci_save_state(usp); [Severity: High] Can this read-modify-write sequence corrupt the switch state if multiple endpoints probe concurrently? The upstream switch port is a shared resource, and modifying its AER capability registers without synchronization could lead to race conditions when other child devices (such as SR-IOV VFs) execute this same path. > + dev_dbg(&usp->dev, "Uncorrectable Internal Errors downgraded and > unmasked\n"); > +} -- Sashiko AI review ยท https://sashiko.dev/#/patchset/[email protected]?part=1
