On Mon, Jul 17, 2023 at 03:04:57PM +0300, Ilpo Järvinen wrote:
> Don't assume that only the driver would be accessing LNKCTL. ASPM
> policy changes can trigger write to LNKCTL outside of driver's control.
> And in the case of upstream bridge, the driver does not even own the
> device it's changing the registers for.
> 
> Use RMW capability accessors which do proper locking to avoid losing
> concurrent updates to the register value.
> 
> Fixes: a2e73f56fa62 ("drm/amdgpu: Add support for CIK parts")
> Fixes: 62a37553414a ("drm/amdgpu: add si implementation v10")
> Suggested-by: Lukas Wunner <[email protected]>
> Signed-off-by: Ilpo Järvinen <[email protected]>
> Cc: [email protected]

Do we have any reports of problems that are fixed by this patch (or by
others in the series)?  If not, I'm not sure it really fits the usual
stable kernel criteria:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/stable-kernel-rules.rst?id=v6.4

> ---
>  drivers/gpu/drm/amd/amdgpu/cik.c | 36 +++++++++-----------------------
>  drivers/gpu/drm/amd/amdgpu/si.c  | 36 +++++++++-----------------------
>  2 files changed, 20 insertions(+), 52 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c 
> b/drivers/gpu/drm/amd/amdgpu/cik.c
> index 5641cf05d856..e63abdf52b6c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/cik.c
> +++ b/drivers/gpu/drm/amd/amdgpu/cik.c
> @@ -1574,17 +1574,8 @@ static void cik_pcie_gen3_enable(struct amdgpu_device 
> *adev)
>                       u16 bridge_cfg2, gpu_cfg2;
>                       u32 max_lw, current_lw, tmp;
>  
> -                     pcie_capability_read_word(root, PCI_EXP_LNKCTL,
> -                                               &bridge_cfg);
> -                     pcie_capability_read_word(adev->pdev, PCI_EXP_LNKCTL,
> -                                               &gpu_cfg);
> -
> -                     tmp16 = bridge_cfg | PCI_EXP_LNKCTL_HAWD;
> -                     pcie_capability_write_word(root, PCI_EXP_LNKCTL, tmp16);
> -
> -                     tmp16 = gpu_cfg | PCI_EXP_LNKCTL_HAWD;
> -                     pcie_capability_write_word(adev->pdev, PCI_EXP_LNKCTL,
> -                                                tmp16);
> +                     pcie_capability_set_word(root, PCI_EXP_LNKCTL, 
> PCI_EXP_LNKCTL_HAWD);
> +                     pcie_capability_set_word(adev->pdev, PCI_EXP_LNKCTL, 
> PCI_EXP_LNKCTL_HAWD);
>  
>                       tmp = RREG32_PCIE(ixPCIE_LC_STATUS1);
>                       max_lw = (tmp & 
> PCIE_LC_STATUS1__LC_DETECTED_LINK_WIDTH_MASK) >>
> @@ -1637,21 +1628,14 @@ static void cik_pcie_gen3_enable(struct amdgpu_device 
> *adev)
>                               msleep(100);
>  
>                               /* linkctl */
> -                             pcie_capability_read_word(root, PCI_EXP_LNKCTL,
> -                                                       &tmp16);
> -                             tmp16 &= ~PCI_EXP_LNKCTL_HAWD;
> -                             tmp16 |= (bridge_cfg & PCI_EXP_LNKCTL_HAWD);
> -                             pcie_capability_write_word(root, PCI_EXP_LNKCTL,
> -                                                        tmp16);
> -
> -                             pcie_capability_read_word(adev->pdev,
> -                                                       PCI_EXP_LNKCTL,
> -                                                       &tmp16);
> -                             tmp16 &= ~PCI_EXP_LNKCTL_HAWD;
> -                             tmp16 |= (gpu_cfg & PCI_EXP_LNKCTL_HAWD);
> -                             pcie_capability_write_word(adev->pdev,
> -                                                        PCI_EXP_LNKCTL,
> -                                                        tmp16);
> +                             pcie_capability_clear_and_set_word(root, 
> PCI_EXP_LNKCTL,
> +                                                                
> PCI_EXP_LNKCTL_HAWD,
> +                                                                bridge_cfg &
> +                                                                
> PCI_EXP_LNKCTL_HAWD);
> +                             pcie_capability_clear_and_set_word(adev->pdev, 
> PCI_EXP_LNKCTL,
> +                                                                
> PCI_EXP_LNKCTL_HAWD,
> +                                                                gpu_cfg &
> +                                                                
> PCI_EXP_LNKCTL_HAWD);

Wow, there's a lot of pointless-looking work going on here:

  set root PCI_EXP_LNKCTL_HAWD
  set GPU  PCI_EXP_LNKCTL_HAWD

  for (i = 0; i < 10; i++) {
    read root PCI_EXP_LNKCTL
    read GPU  PCI_EXP_LNKCTL

    clear root PCI_EXP_LNKCTL_HAWD
    if (root PCI_EXP_LNKCTL_HAWD was set)
      set root PCI_EXP_LNKCTL_HAWD

    clear GPU  PCI_EXP_LNKCTL_HAWD
    if (GPU  PCI_EXP_LNKCTL_HAWD was set)
      set GPU  PCI_EXP_LNKCTL_HAWD
  }

If it really *is* pointless, it would be nice to clean it up, but that
wouldn't be material for this patch, so what you have looks good.

>                               /* linkctl2 */
>                               pcie_capability_read_word(root, PCI_EXP_LNKCTL2,

The PCI_EXP_LNKCTL2 stuff also includes RMW updates.  I don't see any
uses of PCI_EXP_LNKCTL2 outside this driver that look relevant, so I
guess we don't care about making the PCI_EXP_LNKCTL2 updates atomic?

Bjorn

Reply via email to