Returns different error codes based on the scenario to help the user app
understand the AMDGPU device status when an exception occurs.
Signed-off-by: Yang Wang
---
drivers/gpu/drm/amd/pm/amdgpu_pm.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/
On 8/17/2025 9:00 PM, Mario Limonciello (AMD) wrote:
A variety of issues both in function and in power consumption have been
raised as a result of devices not being put into a low power state when
the system is powered off.
There have been some localized changes[1] to PCI core to help these issu
On 8/29/2025 10:01 AM, Antheas Kapenekakis wrote:
On Fri, 29 Aug 2025 at 16:57, Antheas Kapenekakis wrote:
Currently, when a panel brightness quirk is applied, there is no log
indicating that a quirk was applied. Unwrap the drm device on its own
and use drm_info() to log when a quirk is applie
Currently the kzalloc failure check just reports the failure
and sets the variable ret to -ENOMEM, which is not checked later
for this specific error. Fix this by just returning -ENOMEM rather
than setting ret.
Fixes: 4fb930715468 ("drm/amd/amdgpu: remove redundant host to psp cmd buf
alloca
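The early-return pattern described above can be sketched in plain userspace C. This is only an illustration: calloc() stands in for kzalloc(), and struct cmd_buf and cmd_buf_alloc() are hypothetical names, not the driver's.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a command buffer; not the driver's type. */
struct cmd_buf {
	unsigned char data[64];
};

/*
 * Allocate a zeroed buffer and hand it back through *out.
 * Returns 0 on success or -ENOMEM on allocation failure. The error is
 * returned right away rather than stored in a local "ret" that later
 * code might never check.
 */
static int cmd_buf_alloc(struct cmd_buf **out)
{
	struct cmd_buf *buf = calloc(1, sizeof(*buf));

	if (!buf)
		return -ENOMEM;

	*out = buf;
	return 0;
}
```

Returning at the failure site keeps the error path independent of whatever the rest of the function does with its own status variable.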
Implement TTM-level behavior for AMDGPU_PL_MMIO_REMAP so it behaves as a
CPU-visible IO page:
* amdgpu_evict_flags(): mark as unmovable
* amdgpu_res_cpu_visible(): consider CPU-visible
* amdgpu_bo_move(): use null move when src/dst is MMIO_REMAP
* amdgpu_ttm_io_mem_reserve(): program base/is_iomem
On Tue, Sep 2, 2025 at 3:05 PM David Francis wrote:
>
> With the addition of the drm ioctl
> DRM_IOCTL_AMDGPU_GEM_LIST_HANDLES,
> the drm driver version should be incremented (to 65)
>
> Signed-off-by: David Francis
Reviewed-by: Alex Deucher
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3
[Public]
Regards,
Prike
> -----Original Message-----
> From: amd-gfx On Behalf Of Christian
> König
> Sent: Thursday, August 28, 2025 11:02 PM
> To: Khatri, Sunil
> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
>
> Subject: [PATCH 1/3] drm/amdgpu: fix userq VM validation v3
>
> T
Add a new GEM domain bit AMDGPU_GEM_DOMAIN_MMIO_REMAP to allow
userspace to request the MMIO remap (HDP flush) page via GEM_CREATE.
- include/uapi/drm/amdgpu_drm.h:
* define AMDGPU_GEM_DOMAIN_MMIO_REMAP
* include the bit in AMDGPU_GEM_DOMAIN_MASK
v2: Add early reject in amdgpu_gem_create_ioct
On 02.09.25 10:08, Timur Kristóf wrote:
> On Tue, 2025-09-02 at 08:43 +0200, Christian König wrote:
>> On 01.09.25 12:00, Timur Kristóf wrote:
>>> To avoid confusion with dwords.
>>>
>>> Signed-off-by: Timur Kristóf
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 ++--
>>> 1 file chang
Implement support for the hung queue detect and reset
functionality.
v2: Always use AMDGPU_MES_SCHED_PIPE
Signed-off-by: Alex Deucher
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 31 ++
1 file changed, 31 insertions(+)
diff --git a/drivers/gp
On Sun, Aug 31, 2025 at 6:53 AM Srinivasan Shanmugam
wrote:
>
> The header comment above amdgpu_gem_list_handles_ioctl referenced
> drm_amdgpu_gem_list_handles_ioctl. Update the comment to reflect the
> actual function identifier to avoid misleading prototype warnings.
>
> Fixes the below:
> drive
On Tue, 2025-09-02 at 08:41 +0200, Christian König wrote:
> On 01.09.25 12:00, Timur Kristóf wrote:
> > The amdgpu_bo_create_kernel function takes a byte count,
> > so we need to multiply the extra dword count by four.
> > (The ring_size is already in bytes so that one is correct here.)
>
> Good c
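A minimal sketch of the unit conversion discussed in the quoted patch, assuming a dword is 32 bits. ring_alloc_bytes() and BYTES_PER_DWORD are illustrative names and do not appear in the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* One dword is a 32-bit word, i.e. four bytes. */
#define BYTES_PER_DWORD sizeof(uint32_t)

/*
 * Total allocation size in bytes for a ring plus some extra dwords.
 * ring_size_bytes is already a byte count; only the dword count needs
 * to be scaled before the two are added.
 */
static size_t ring_alloc_bytes(size_t ring_size_bytes, size_t extra_dwords)
{
	return ring_size_bytes + extra_dwords * BYTES_PER_DWORD;
}
```

Without the multiplication, a request for 8 extra dwords would reserve only 8 bytes instead of 32.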
On Mon, Aug 25, 2025 at 10:33 AM Eric Huang wrote:
>
> When creating p2p links, KFD needs to check XGMI link
> with two conditions, hive_id and is_sharing_enabled,
> but the is_sharing_enabled check is missing, so add
> it to fix the error.
>
> Signed-off-by: Eric Huang
Acked-by: Alex Deucher
From: Wenjing Liu
[why]
dchubbub supports performance monitoring for hubbub.
The interfaces define the performance monitoring events and their
attributes.
Reviewed-by: Alvin Lee
Signed-off-by: Wenjing Liu
Signed-off-by: Wayne Lin
---
.../gpu/drm/amd/display/dc/inc/hw/dchubbub.h | 22 +++
Add a one-page TTM range manager for AMDGPU_PL_MMIO_REMAP via
amdgpu_ttm_init_on_chip(). This only registers the placement with TTM;
no BO is allocated in this patch.
The singleton 4K remap BO is created and freed in the following patch.
This split follows to separate heap bring-up from BO alloca
On 8/31/2025 5:12 AM, Przemysław Kopa wrote:
Hello,
I'm running Radeon RX 9060 XT and since upgrading to the kernel 6.15 I'm
facing an issue with audio via DisplayPort. After waking from S3 suspend
(sometimes, but not always) audio doesn't work - pavucontrol shows that
the output is disconnected
[Public]
Regards,
Prike
> -----Original Message-----
> From: amd-gfx On Behalf Of Christian
> König
> Sent: Thursday, August 28, 2025 11:02 PM
> To: Khatri, Sunil
> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
>
> Subject: [PATCH 2/3] drm/amdgpu: remove check for BO reservation
On 8/29/2025 1:30 PM, David Francis wrote:
With the addition of the drm ioctl
DRM_IOCTL_AMDGPU_GEM_LIST_HANDLES,
the drm driver version should be incremented (to 65)
Signed-off-by: David Francis
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
From: Clay King
[Why & How]
Previously, when calculating dto phase, we would incorrectly fail when phase
<=0 without additionally checking for the integer value. This meant that
calculations would incorrectly fail when the desired pixel clock was an exact
multiple of the reference clock.
Reviewe
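A rough sketch of the failure condition described above, under the assumption that the DTO is programmed as an integer part plus a fractional phase derived from the pixel clock and the reference clock. struct dto_params and dto_calc() are hypothetical and are not the display code's real interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result holder; not the driver's structure. */
struct dto_params {
	uint64_t integer;  /* whole multiples of the reference clock */
	uint64_t phase_hz; /* remainder below one reference clock period */
};

/*
 * Split pixclk_hz into an integer multiple of refclk_hz plus a phase.
 * A zero phase alone is not an error: it just means the pixel clock is
 * an exact multiple of the reference clock. The calculation fails only
 * when both the integer part and the phase are zero.
 */
static bool dto_calc(uint64_t pixclk_hz, uint64_t refclk_hz,
		     struct dto_params *out)
{
	uint64_t integer, phase;

	if (refclk_hz == 0)
		return false;

	integer = pixclk_hz / refclk_hz;
	phase = pixclk_hz % refclk_hz;

	if (phase == 0 && integer == 0)
		return false;

	out->integer = integer;
	out->phase_hz = phase;
	return true;
}
```

With a 100 MHz reference, a 200 MHz pixel clock yields integer 2 and phase 0, which a check on the phase alone would have rejected.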
From: Ivan Lipski
[Why&How]
On DCN314, clearing the DPP SW structure without power gating it can cause a
double cursor in full screen with non-native scaling.
Add a workaround (W/A) that clears the CURSOR0_CONTROL cursor_enable flag if
dcn10_plane_atomic_power_down is called and DPP power gating is disabled.
Reviewed-b
On Tue, Sep 2, 2025 at 9:27 AM Christian König wrote:
>
> On 02.09.25 15:25, Alex Deucher wrote:
> > On Tue, Sep 2, 2025 at 3:38 AM Christian König
> > wrote:
> >>
> >> On 02.09.25 05:29, Srinivasan Shanmugam wrote:
> >>> Add mmio_remap bookkeeping to amdgpu_device and introduce
> >>> amdgpu_ttm
On 2025-08-28 10:08, Mario Limonciello (AMD) wrote:
> [Why]
> Although compositors will add their own modes, Xorg won't use its own
> modes and will only stick to modes advertised by the driver. This means a
> user that used to pick 1024x768 could no longer access it unless the
> panel's native
On Mon, Sep 1, 2025 at 5:13 AM Liang, Prike wrote:
>
> [Public]
>
>
>
> Regards,
> Prike
>
> > -----Original Message-----
> > From: Alex Deucher
> > Sent: Thursday, August 28, 2025 6:13 AM
> > To: Liang, Prike ; Koenig, Christian
> >
> > Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexande
On 2025-08-29 5:58 a.m., Gustavo A. R. Silva wrote:
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.
Move the conflicting declarations to the end of the corresponding
structures. Notice that `struct dev_pagemap` is a flexible structure,
th
This series introduces a kernel-managed singleton BO representing the
MMIO-remap (HDP flush) page and exposes it to userspace through a new GEM
domain.
Design
--
- A tiny (1-page) TTM bucket is introduced for AMDGPU_PL_MMIO_REMAP
(mirroring doorbells).
- A singleton BO is created during am
Enable userspace to obtain a handle to the kernel-owned MMIO_REMAP
singleton when AMDGPU_GEM_DOMAIN_MMIO_REMAP is requested via
amdgpu_gem_create_ioctl().
Validate the fixed 4K constraint: if PAGE_SIZE > AMDGPU_GPU_PAGE_SIZE
return -EINVAL; when provided, size and alignment must equal
AMDGPU_GPU_P
Wire up the conversions and strings for the new MMIO_REMAP placement:
* amdgpu_mem_type_to_domain() maps AMDGPU_PL_MMIO_REMAP -> domain
* amdgpu_bo_placement_from_domain() accepts the new domain
* amdgpu_bo_mem_stats_placement() and amdgpu_bo_print_info() report it
* res cursor supports the new pl
Introduce a kernel-internal TTM placement type AMDGPU_PL_MMIO_REMAP
for the HDP flush MMIO remap page
Plumbing added:
- amdgpu_res_cursor.{first,next}: treat MMIO_REMAP like DOORBELL
- amdgpu_ttm_io_mem_reserve(): return BAR bus address + offset
for MMIO_REMAP, mark as uncached I/O
- amdgpu_ttm_
Increase TTM_NUM_MEM_TYPES from 8 to 9 to accommodate the upcoming
AMDGPU_PL_MMIO_REMAP placement.
Cc: Alex Deucher
Suggested-by: Christian König
Signed-off-by: Srinivasan Shanmugam
Reviewed-by: Christian König
Reviewed-by: Alex Deucher
---
include/drm/ttm/ttm_resource.h | 2 +-
1 file chang
Correct valid_bits and ms_chk_bits of section info field for bad page
threshold exceed CPER to match OOB's behavior.
Signed-off-by: Xiang Liu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cper
Applied. Thanks!
Alex
On Tue, Sep 2, 2025 at 8:49 AM Colin Ian King wrote:
>
> Currently the kzalloc failure check just reports the failure
> and sets the variable ret to -ENOMEM, which is not checked later
> for this specific error. Fix this by just returning -ENOMEM rather
> than setting
On Sun, Aug 31, 2025 at 6:13 AM Srinivasan Shanmugam
wrote:
>
> Align the function headers for `amdgpu_max_hdmi_pixel_clock` and
> `amdgpu_connector_dvi_mode_valid` with the function implementations so
> they match the expected kdoc style.
>
> Fixes the below:
> drivers/gpu/drm/amd/amdgpu/amdgpu_c
Ping ...
On 2025-08-25 10:23, Eric Huang wrote:
When creating p2p links, KFD needs to check XGMI link
with two conditions, hive_id and is_sharing_enabled,
but the is_sharing_enabled check is missing, so add
it to fix the error.
Signed-off-by: Eric Huang
---
drivers/gpu/drm/amd/amdkfd/kfd_t
On 02.09.25 15:31, Alex Deucher wrote:
> On Tue, Sep 2, 2025 at 9:27 AM Christian König
> wrote:
>>
>> On 02.09.25 15:25, Alex Deucher wrote:
>>> On Tue, Sep 2, 2025 at 3:38 AM Christian König
>>> wrote:
On 02.09.25 05:29, Srinivasan Shanmugam wrote:
> Add mmio_remap bookkeeping t
[Public]
Hi all,
This week this patchset was tested on 4 systems, two dGPU and two APU based,
and tested across multiple display and connection types.
APU
* Single Display eDP -> 1080p 60hz, 1920x1200 165hz, 3840x2400 60hz
* Single Display DP (SST DSC) -> 4k144hz, 4k240hz
From: Taimur Hassan
Summary:
* Refactor bounding box values handling
* Fix incorrect condition to fail dto clk calculation
* Skip check downlink setting for a certain MST branch device
* Fix double cursor issue on dcn314
Signed-off-by: Taimur Hassan
Signed-off-by: Alex Hung
Tested-by: Dan Whe
From: Mario Limonciello
[Why]
Custom brightness curve works by walking through all data points one
by one. When the brightness value is at either extreme this is a lot
of data points to walk. This is especially noticeable as lag when
moving a brightness slider around.
[How]
Bisect the
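To illustrate the general idea of bisecting instead of walking point by point, here is a minimal sketch. It assumes the curve is an array of points sorted by ascending input level; struct curve_point and curve_find_segment() are hypothetical names, and the real data layout in the driver may differ.

```c
#include <stddef.h>

/* Hypothetical curve sample: an input level and its mapped output. */
struct curve_point {
	unsigned int input;
	unsigned int output;
};

/*
 * Binary search over points sorted by ascending input.
 * Returns the index of the last point whose input is <= level, or 0 if
 * level lies below the first point. The caller must pass n >= 1.
 * Takes O(log n) steps instead of the O(n) of a linear walk.
 */
static size_t curve_find_segment(const struct curve_point *pts, size_t n,
				 unsigned int level)
{
	size_t lo = 0, hi = n; /* answer always stays within [lo, hi) */

	while (hi - lo > 1) {
		size_t mid = lo + (hi - lo) / 2;

		if (pts[mid].input <= level)
			lo = mid;
		else
			hi = mid;
	}

	return lo;
}
```

The cost no longer depends on where the requested value sits in the table, so values at either extreme are found as quickly as values in the middle.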
On 02.09.25 15:25, Alex Deucher wrote:
> On Tue, Sep 2, 2025 at 3:38 AM Christian König
> wrote:
>>
>> On 02.09.25 05:29, Srinivasan Shanmugam wrote:
>>> Add mmio_remap bookkeeping to amdgpu_device and introduce
>>> amdgpu_ttm_mmio_remap_bo_init()/fini() to manage a kernel-owned,
>>> one-page (4K
On Mon, Sep 01, 2025 at 11:27:01AM +0200, Michel Dänzer wrote:
> use some kind of debug output API which doesn't hit dmesg by default
You still want it to be enabled by default so that normal users can see it and
actually report it.
> (can be a non-once variant instead, that's more useful for user-s
Use KMEM_CACHE() instead of kmem_cache_create() to simplify the code.
Signed-off-by: Longlong Xia
---
drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 6 +-
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
b/drivers/gpu/drm/amd/am
On Mon, 2025-09-01 at 11:13 +0100, Tvrtko Ursulin wrote:
>
> Hi,
>
> On 01/09/2025 11:00, Timur Kristóf wrote:
> > Technically not necessary, but clear the extra dwords too,
> > so that the command processors don't read uninitialized memory.
> >
> > Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and a
From: Alex Deucher
Add a detect and reset callback and add the implementation
for mes. The callback will detect all hung queues of a
particular ip type (e.g., GFX or compute or SDMA) and
reset them.
v2: increase reset counter and set fence force completion
v3: Removed userq_mutex in mes_userq_d
On 02/09/2025 12:30, Timur Kristóf wrote:
On Mon, 2025-09-01 at 11:13 +0100, Tvrtko Ursulin wrote:
Hi,
On 01/09/2025 11:00, Timur Kristóf wrote:
Technically not necessary, but clear the extra dwords too,
so that the command processors don't read uninitialized memory.
Fixes: c8c1a1d2ef04 ("
On 02.09.25 09:27, Longlong Xia wrote:
> Use KMEM_CACHE() instead of kmem_cache_create() to simplify the code.
In general a good cleanup, but why are we using a separate kmem_cache here in
the first place?
SLAB_HWCACHE_ALIGN rounds up the struct size to 128 bytes and that is something
kzalloc()
On Tue, 2025-09-02 at 13:10 +0200, Christian König wrote:
> On 02.09.25 10:08, Timur Kristóf wrote:
> > On Tue, 2025-09-02 at 08:43 +0200, Christian König wrote:
> > > On 01.09.25 12:00, Timur Kristóf wrote:
> > > > To avoid confusion with dwords.
> > > >
> > > > Signed-off-by: Timur Kristóf
> >
On 02.09.25 10:26, Timur Kristóf wrote:
> On Tue, 2025-09-02 at 08:41 +0200, Christian König wrote:
>> On 01.09.25 12:00, Timur Kristóf wrote:
>>> The amdgpu_bo_create_kernel function takes a byte count,
>>> so we need to multiply the extra dword count by four.
>>> (The ring_size is already in byte
From: Alex Deucher
This patch adds robust reset handling for user queues (userq) to improve
recovery from queue failures. The key components include:
1. Queue detection and reset logic:
- amdgpu_userq_detect_and_reset_queues() identifies failed queues
- Per-IP detect_and_reset callbacks fo
On 9/2/2025 3:53 PM, Jani Nikula wrote:
On Tue, 02 Sep 2025, Arunpravin Paneer Selvam
wrote:
Replace the freelist (O(n)) used for free block management with a
red-black tree, providing more efficient O(log n) search, insert,
and delete operations. This improves scalability and performance
w
On Tue, 02 Sep 2025, Arunpravin Paneer Selvam
wrote:
> Replace the freelist (O(n)) used for free block management with a
> red-black tree, providing more efficient O(log n) search, insert,
> and delete operations. This improves scalability and performance
> when managing large numbers of free blo
On Tue, 2025-09-02 at 11:54 +0200, Christian König wrote:
> On 02.09.25 09:42, Timur Kristóf wrote:
> > On Tue, 2025-09-02 at 08:39 +0200, Christian König wrote:
> > > On 01.09.25 12:00, Timur Kristóf wrote:
> > > > Technically not necessary, but clear the extra dwords too,
> > > > so that the comm
On 02.09.25 09:42, Timur Kristóf wrote:
> On Tue, 2025-09-02 at 08:39 +0200, Christian König wrote:
>> On 01.09.25 12:00, Timur Kristóf wrote:
>>> Technically not necessary, but clear the extra dwords too,
>>> so that the command processors don't read uninitialized memory.
>>
>> That is most likely
Increase TTM_NUM_MEM_TYPES from 8 to 9 to accommodate the upcoming
AMDGPU_PL_MMIO_REMAP placement.
Cc: Alex Deucher
Suggested-by: Christian König
Signed-off-by: Srinivasan Shanmugam
Reviewed-by: Christian König
Reviewed-by: Alex Deucher
---
include/drm/ttm/ttm_resource.h | 2 +-
1 file chang
On 8/28/2025 8:32 PM, Christian König wrote:
This reverts commit 0479956c94b1cfa6a1ab9206eff76072944ece8b.
It turned out that protecting the status of each bo_va only with a
spinlock was just hiding problems instead of solving them.
Revert the whole approach, add a separate stats_lock and loc
LGTM but someone else should check.
Acked-by: Sunil Khatri
On 8/28/2025 8:31 PM, Christian König wrote:
We should leave such checks to lockdep and not implement something
manually.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 +
1 file changed, 1
On 8/28/2025 8:31 PM, Christian König wrote:
That was actually complete nonsense and not validating the BOs
at all. The code just cleared all VM areas where it couldn't grab the
lock for a BO.
Try to fix this. Only compile tested at the moment.
v2: fix fence slot reservation as well as pointed
This commit implements the actual MES (Micro Engine Scheduler) suspend
and resume gang operations for version 12 hardware. Previously these
functions were just stubs returning success.
v2: Always use AMDGPU_MES_SCHED_PIPE
Signed-off-by: Alex Deucher
Signed-off-by: Jesse Zhang
---
drivers/gpu/d
From: Alex Deucher
Track resets from user queues.
Signed-off-by: Alex Deucher
Reviewed-by: Christian König
Reviewed-by: Sunil Khatri
---
drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 3 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h | 1 +
2 files changed, 4 insertions(+)
diff --git a/drivers/
From: Alex Deucher
Helper function to detect and reset hung queues. MES will
return an array of doorbell indices of which queues are hung
and were optionally reset.
v2: Clear the doorbell array before detection
Signed-off-by: Alex Deucher
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/
From: Alex Deucher
Use the suspend and resume API rather than remove queue
and add queue API. The former just preempts the queue
while the latter removes it from the scheduler completely.
There is no need to do that, we only need preemption
in this case.
V2: replace queue_active with queue state
From: Alex Deucher
Add two new function pointers to struct amdgpu_userq_funcs:
- preempt: To handle preemption of user mode queues
- restore: To restore preempted user mode queues
These callbacks will allow the driver to properly manage queue
preemption and restoration when needed, such as durin
On Tue, 2025-09-02 at 08:45 +0200, Christian König wrote:
> On 01.09.25 12:00, Timur Kristóf wrote:
> > SDMA v3-v5 can copy almost 4 MiB in a single copy operation.
> > Use the same value as PAL and Mesa for copy_max_bytes.
> >
> > For reference, see oss2DmaCmdBuffer.cpp in PAL:
> > "Due t
On Tue, 2025-09-02 at 08:39 +0200, Christian König wrote:
> On 01.09.25 12:00, Timur Kristóf wrote:
> > Technically not necessary, but clear the extra dwords too,
> > so that the command processors don't read uninitialized memory.
>
> That is most likely a really bad idea.
>
> The extra DWs are f
[AMD Official Use Only - AMD Internal Distribution Only]
Reviewed-by: Ce Sun
Best Regards,
Sun,Ce
From: Lazar, Lijo
Sent: Tuesday, September 2, 2025 2:22 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Deucher, Alexander
; Sun, Ce(Overlord)
Sub
On 02.09.25 05:29, Srinivasan Shanmugam wrote:
> Enable userspace to obtain a handle to the kernel-owned MMIO_REMAP
> singleton when AMDGPU_GEM_DOMAIN_MMIO_REMAP is requested via
> amdgpu_gem_create_ioctl().
>
> Validate the fixed 4K constraint: if PAGE_SIZE > AMDGPU_GPU_PAGE_SIZE
> return -EINVAL
On 02.09.25 05:29, Srinivasan Shanmugam wrote:
> Add mmio_remap bookkeeping to amdgpu_device and introduce
> amdgpu_ttm_mmio_remap_bo_init()/fini() to manage a kernel-owned,
> one-page (4K) BO in AMDGPU_GEM_DOMAIN_MMIO_REMAP.
>
> Bookkeeping:
> - adev->rmmio_remap.bo : kernel-owned singleton BO