Thank you for your contribution! Sashiko AI review found 4 potential issue(s) 
to consider:
- [High] Bypassing the Linux driver core by manually invoking probe and remove 
callbacks causes Use-After-Free on failure.
- [High] Unbalanced decrement of the `xe->wedged.ref` counter incorrectly 
permanently wedges the device after a non-fatal error recovery.
- [High] Unprotected Read-Modify-Write of Upstream Switch Port (USP) PCIe AER 
registers causes race conditions and corrupts switch state.
- [Medium] Unbounded firmware query loops in `get_queued_pages()` and 
`get_offlined_list()` allow a malfunctioning sysctrl firmware to hang the 
kernel.
--

commit 1dea0dcd39b5b10ff25f34fd2fa55710387abdc7
Author: Riana Tauro <[email protected]>
Subject: Introduce Xe Uncorrectable Error Handling

This patch introduces PCI error handling to the Xe driver to process
and recover from hardware and firmware errors. It queries the system
controller for RAS error details and handles device teardown and re-probe
sequences during slot resets.

Link: https://patchwork.freedesktop.org/series/160482/

> diff --git a/drivers/gpu/drm/xe/xe_pci_error.c 
> b/drivers/gpu/drm/xe/xe_pci_error.c
> new file mode 100644
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_pci_error.c
[ ... ]
> +static pci_ers_result_t xe_pci_error_detected(struct pci_dev *pdev, 
> pci_channel_state_t state)
> +{
> +     struct xe_device *xe = pdev_to_xe_device(pdev);
> +
> +     xe_err(xe, "PCI error: detected state = %u\n", state);
> +
> +     if (state == pci_channel_io_perm_failure)
> +             return PCI_ERS_RESULT_DISCONNECT;
> +
> +     /* If the device is already wedged or in survivability mode, do not 
> attempt recovery */
> +     if (xe_survivability_mode_is_boot_enabled(xe) || xe_device_wedged(xe))
> +             return PCI_ERS_RESULT_DISCONNECT;
> +
> +     switch (state) {
> +     case pci_channel_io_normal:
> +             return PCI_ERS_RESULT_CAN_RECOVER;
[ ... ]
> +static pci_ers_result_t xe_pci_error_slot_reset(struct pci_dev *pdev)
> +{
[ ... ]
> +     /*
> +      * Secondary Bus Reset causes all VRAM state to be lost along with
> +      * hardware state. As an initial step, re-probe the device to
> +      * re-initialize the driver and hardware.
> +      * TODO: optimize by re-initializing only the hardware state and 
> re-creating
> +      * kernel BOs.
> +      */
> +     xe_device_clear_in_reset(xe);
> +     pdev->driver->remove(pdev);
> +     devres_release_group(&pdev->dev, xe->devres_group_id);
> +
> +     if (pdev->driver->probe(pdev, ent))
> +             return PCI_ERS_RESULT_DISCONNECT;

[Severity: High]
Can this manual driver teardown cause a use-after-free?

If pdev->driver->probe() fails before updating the driver data pointer,
pdev->dev.driver_data will remain pointing to the memory freed by
devres_release_group().

The PCI core might later call xe_pci_remove(), which will dereference
this stale pointer. Would it be safer to avoid manually calling driver
lifecycle callbacks?

> +
> +     xe = pdev_to_xe_device(pdev);
> +
> +     /* Wedge the device to prevent I/O operations till the resume callback 
> */
> +     xe_device_wedged_get(xe);
> +
> +     return PCI_ERS_RESULT_RECOVERED;
> +}
> +
> +static void xe_pci_error_resume(struct pci_dev *pdev)
> +{
> +     struct xe_device *xe = pdev_to_xe_device(pdev);
> +
> +     xe_err(xe, "PCI error: resume\n");
> +
> +     /* Resume I/O operations */
> +     xe_device_wedged_put(xe);

[Severity: High]
Does this unconditional put cause an underflow when recovering from non-fatal
errors?

In xe_pci_error_detected(), the pci_channel_io_normal path returns
PCI_ERS_RESULT_CAN_RECOVER without calling prepare_device_for_reset(), so
the wedged refcount is never incremented.

When xe_pci_error_resume() is later called, this put will decrement the
counter to -1, which incorrectly evaluates as true in xe_device_wedged()
and permanently wedges the device.

> +}
> diff --git a/drivers/gpu/drm/xe/xe_ras.c b/drivers/gpu/drm/xe/xe_ras.c
> new file mode 100644
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_ras.c
[ ... ]
> +static void get_queued_pages(struct xe_device *xe)
> +{
> +     struct xe_sysctrl_mailbox_command command = {0};
> +     struct xe_ras_page_offline_queue response = {0};
> +     u32 count = 0;
> +     size_t rlen;
> +     int ret, i;
[ ... ]
> +     do {
> +             memset(&response, 0, sizeof(response));
> +
> +             ret = xe_sysctrl_send_command(&xe->sc, &command, &rlen);
[ ... ]
> +     } while (response.additional_data);

[Severity: Medium]
Could a malfunctioning sysctrl firmware cause a soft lockup here?

The loop relies entirely on the firmware's additional_data flag and
total_pages limit. Unlike xe_ras_process_errors() which enforces an
XE_SYSCTRL_FLOOD_LIMIT, this loop appears to lack a driver-enforced
iteration ceiling.

> +}
> +
> +static void get_offlined_list(struct xe_device *xe)
> +{
> +     struct xe_sysctrl_mailbox_command command = {0};
> +     struct xe_ras_page_offline_list response = {0};
> +     u32 count = 0;
> +     size_t rlen;
> +     int ret, i;
[ ... ]
> +     do {
> +             memset(&response, 0, sizeof(response));
> +
> +             ret = xe_sysctrl_send_command(&xe->sc, &command, &rlen);
[ ... ]
> +     } while (response.additional_data);

[Severity: Medium]
Is there a similar risk of a soft lockup in this loop if the firmware
continuously returns additional_data?

> +}
[ ... ]
> +static void aer_unmask_and_downgrade_internal_error(struct xe_device *xe)
> +{
> +     struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
> +     u32 aer_uncorr_mask, aer_uncorr_sev, aer_uncorr_status;
> +     struct pci_dev *usp;
> +     u16 aer_cap;
> +
> +     usp = find_usp_dev(pdev);
> +     if (!usp)
> +             return;
[ ... ]
> +     /* Initialize Uncorrectable Error Severity Register */
> +     pci_read_config_dword(usp, aer_cap + PCI_ERR_UNCOR_SEVER, 
> &aer_uncorr_sev);
> +     aer_uncorr_sev &= ~PCI_ERR_UNC_INTN;
> +     pci_write_config_dword(usp, aer_cap + PCI_ERR_UNCOR_SEVER, 
> aer_uncorr_sev);
> +
> +     /* Initialize Uncorrectable Error Mask Register */
> +     pci_read_config_dword(usp, aer_cap + PCI_ERR_UNCOR_MASK, 
> &aer_uncorr_mask);
> +     aer_uncorr_mask &= ~PCI_ERR_UNC_INTN;
> +     pci_write_config_dword(usp, aer_cap + PCI_ERR_UNCOR_MASK, 
> aer_uncorr_mask);
> +
> +     pci_save_state(usp);

[Severity: High]
Can this read-modify-write sequence corrupt the switch state if multiple
endpoints probe concurrently?

The upstream switch port is a shared resource, and modifying its AER
capability registers without synchronization could lead to race conditions
when other child devices (such as SR-IOV VFs) execute this same path.

> +     dev_dbg(&usp->dev, "Uncorrectable Internal Errors downgraded and 
> unmasked\n");
> +}

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=1

Reply via email to