RE: [PATCH] drm/amdgpu: notify amdgpu gpu reset state via uevent

2025-09-25 Thread Lazar, Lijo
[Public] Presently, there is this one also - drm_dev_wedged_event. Perhaps it's better to modify this to include additional info like pre and post reset along with cause of reset? Thanks, Lijo -Original Message- From: amd-gfx On Behalf Of Yang Wang Sent: Friday, September 26, 2025 12:0

RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw

2025-09-25 Thread Lazar, Lijo
[Public] The intention is to let kgd2kfd_interrupt thread know that KFD is done with interrupt handling and exit at the earliest (that is even without going through kfd node loop). I was thinking of checking ih_wq NULL value, but since that value is not under lock, it's not necessary that kgd2k

[PATCH] drm/amdgpu: notify amdgpu gpu reset state via uevent

2025-09-25 Thread Yang Wang
Use the uevent mechanism to expose the GPU reset state, so that the system tool can more accurately monitor the device reset status. example: $ sudo cat /sys/kernel/debug/dri//amdgpu_gpu_recover KERNEL[172.053149] change /devices/pci:00/:00:03.1/:03:00.0/:04:00.0/:05:00.0 (

Re: [PATCH v7 1/3] drm/buddy: Optimize free block management with RB tree

2025-09-25 Thread Arunpravin Paneer Selvam
Hi Matthew, Ping ? Regards, Arun. On 9/23/2025 2:32 PM, Arunpravin Paneer Selvam wrote: Replace the freelist (O(n)) used for free block management with a red-black tree, providing more efficient O(log n) search, insert, and delete operations. This improves scalability and performance when mana

Re: [PATCH V4 14/18] amdkfd: record kfd process id into kfd process_info

2025-09-25 Thread Zhu, Lingshan
On 9/25/2025 5:45 AM, Kuehling, Felix wrote: > On 2025-09-23 03:26, Zhu Lingshan wrote: >> This commit records the id of the owner >> kfd_process into a kfd process_info when >> create it. >> >> Signed-off-by: Zhu Lingshan >> --- >>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h   | 2 ++ >>   d

Re: [PATCH V4 08/18] amdkfd: identify a secondary kfd process by its id

2025-09-25 Thread Zhu, Lingshan
On 9/25/2025 5:41 AM, Kuehling, Felix wrote: > On 2025-09-23 03:25, Zhu Lingshan wrote: >> This commit introduces a new id field for >> struct kfd process, which helps identify >> a kfd process among multiple contexts that >> all belong to a single user space program. >> >> The sysfs entry of a se

Re: [PATCH V4 17/18] amdkfd: set_debug_trap ioctl only works on a primary kfd_process target

2025-09-25 Thread Zhu, Lingshan
On 9/25/2025 5:50 AM, Kuehling, Felix wrote: > On 2025-09-23 03:26, Zhu Lingshan wrote: >> The user space program pass down a pid to kfd >> through set_debug_trap ioctl, which can help >> find the corresponding user space program and >> its mm struct. >> >> However, these information is insufficie

RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw

2025-09-25 Thread Zhang, Yifan
[AMD Official Use Only - AMD Internal Distribution Only] flush_workqueue(kfd->ih_wq) and destroy_workqueue(kfd->ih_wq) in kfd_cleanup_nodes clean up pending work items, and node->interrupts_active check prevent new work items from being enqueued. So after kfd_cleanup_nodes free kfd node, there

Re: [PATCH v2 1/3] PM: hibernate: Fix hybrid-sleep

2025-09-25 Thread Rafael J. Wysocki
On Thu, Sep 25, 2025 at 5:59 PM Mario Limonciello (AMD) wrote: > > Hybrid sleep will hibernate the system followed by running through > the suspend routine. Since both the hibernate and the suspend routine > will call pm_restrict_gfp_mask(), pm_restore_gfp_mask() must be called > before starting

[v6 1/2] drm/amdgpu: Convert amdgpu userqueue management from IDR to XArray

2025-09-25 Thread Jesse . Zhang
This commit refactors the AMDGPU userqueue management subsystem to replace IDR (ID Allocation) with XArray for improved performance, scalability, and maintainability. The changes address several issues with the previous IDR implementation and provide better locking semantics. Key changes: 1. **Gl

[Patch v1] drm/amdgpu: use user provided hmm_range buffer in amdgpu_ttm_tt_get_user_pages

2025-09-25 Thread Sunil Khatri
Update amdgpu_ttm_tt_get_user_pages and all dependent functions, along with their callers, to use a user-allocated hmm_range buffer instead of having the hmm layer allocate the buffer. This is needed to make hmm_range pointers easily accessible without accessing the bo, and that is a requirement for the userqueue

Re: [PATCH] drm/amd: Check whether secure display TA loaded successfully

2025-09-25 Thread Mario Limonciello
On 9/25/2025 4:16 PM, Alex Deucher wrote: On Thu, Sep 25, 2025 at 3:50 PM Mario Limonciello wrote: On 9/25/2025 2:46 PM, Alex Deucher wrote: On Thu, Sep 25, 2025 at 3:39 PM Mario Limonciello wrote: [Why] Not all renoir hardware supports secure display. If the TA is present but the fe

Re: [PATCH v2 1/3] PM: hibernate: Fix hybrid-sleep

2025-09-25 Thread Mario Limonciello (AMD) (kernel.org)
On 9/25/2025 12:55 PM, Rafael J. Wysocki wrote: On Thu, Sep 25, 2025 at 7:51 PM Rafael J. Wysocki wrote: On Thu, Sep 25, 2025 at 7:47 PM Rafael J. Wysocki wrote: On Thu, Sep 25, 2025 at 5:59 PM Mario Limonciello (AMD) wrote: Hybrid sleep will hibernate the system followed by running t

Re: [PATCH 2/5] drm/amd/display: Add missing DCE6 SCL_HORZ_FILTER_INIT* SRIs

2025-09-25 Thread Timur Kristóf
Alex Deucher wrote (on Thu, Sep 25, 2025, 23:28): > On Thu, Sep 25, 2025 at 2:45 PM Timur Kristóf > wrote: > > > > Without these, it's impossible to program these registers. > > > > Fixes: 102b2f587ac8 ("drm/amd/display: dce_transform: DCE6 Scaling > Horizontal Filter Init (v2)") > >

Re: [PATCH] drm/amd: Check whether secure display TA loaded successfully

2025-09-25 Thread Alex Deucher
On Thu, Sep 25, 2025 at 5:47 PM Mario Limonciello wrote: > > > > On 9/25/2025 4:16 PM, Alex Deucher wrote: > > On Thu, Sep 25, 2025 at 3:50 PM Mario Limonciello > > wrote: > >> > >> > >> > >> On 9/25/2025 2:46 PM, Alex Deucher wrote: > >>> On Thu, Sep 25, 2025 at 3:39 PM Mario Limonciello > >>>

Re: [PATCH 2/5] drm/amd/display: Add missing DCE6 SCL_HORZ_FILTER_INIT* SRIs

2025-09-25 Thread Alex Deucher
On Thu, Sep 25, 2025 at 5:33 PM Timur Kristóf wrote: > > > > Alex Deucher wrote (on Thu, Sep 25, 2025, > 23:28): >> >> On Thu, Sep 25, 2025 at 2:45 PM Timur Kristóf >> wrote: >> > >> > Without these, it's impossible to program these registers. >> > >> > Fixes: 102b2f587ac8 ("drm/am

Re: [PATCH 2/5] drm/amd/display: Add missing DCE6 SCL_HORZ_FILTER_INIT* SRIs

2025-09-25 Thread Alex Deucher
On Thu, Sep 25, 2025 at 2:45 PM Timur Kristóf wrote: > > Without these, it's impossible to program these registers. > > Fixes: 102b2f587ac8 ("drm/amd/display: dce_transform: DCE6 Scaling Horizontal > Filter Init (v2)") > Signed-off-by: Timur Kristóf I think it would make sense to just squash pa

Re: [PATCH] drm/amd: Check whether secure display TA loaded successfully

2025-09-25 Thread Alex Deucher
On Thu, Sep 25, 2025 at 3:50 PM Mario Limonciello wrote: > > > > On 9/25/2025 2:46 PM, Alex Deucher wrote: > > On Thu, Sep 25, 2025 at 3:39 PM Mario Limonciello > > wrote: > >> > >> [Why] > >> Not all renoir hardware supports secure display. If the TA is present > >> but the feature isn't suppor

Re: [PATCH] drm/amd: Check whether secure display TA loaded successfully

2025-09-25 Thread Mario Limonciello
On 9/25/2025 2:46 PM, Alex Deucher wrote: On Thu, Sep 25, 2025 at 3:39 PM Mario Limonciello wrote: [Why] Not all renoir hardware supports secure display. If the TA is present but the feature isn't supported it will fail to load or send commands. This shows ERR messages to the user that mak

Re: [PATCH] drm/amd: Check whether secure display TA loaded successfully

2025-09-25 Thread Alex Deucher
On Thu, Sep 25, 2025 at 3:39 PM Mario Limonciello wrote: > > [Why] > Not all renoir hardware supports secure display. If the TA is present > but the feature isn't supported it will fail to load or send commands. > This shows ERR messages to the user that make it seem like there is > a problem. >

Re: [PATCH v3 0/3] Fixes for hybrid sleep

2025-09-25 Thread Rafael J. Wysocki
On Thu, Sep 25, 2025 at 8:51 PM Mario Limonciello (AMD) wrote: > > From: Mario Limonciello > > Ionut Nechita reported recently a hibernate failure, but in debugging > the issue it's actually not a hibernate failure; but a hybrid sleep > failure. > > Multiple changes related to the change of when

[PATCH] drm/radeon: Solve the problem of the audio options not disappearing promptly after unplugging the HDMI audio.

2025-09-25 Thread 2564278112
From: Wang Jiang The audio detection process in the Radeon driver is as follows: radeon_dvi_detect/radeon_dp_detect -> radeon_audio_detect -> radeon_audio_enable -> radeon_audio_component_notify -> radeon_audio_component_get_eld When HDMI is unplugged, radeon_dvi_detect is triggered. At this po

[PATCH v3 0/3] Fixes for hybrid sleep

2025-09-25 Thread Mario Limonciello (AMD)
From: Mario Limonciello Ionut Nechita reported recently a hibernate failure, but in debugging the issue it's actually not a hibernate failure; but a hybrid sleep failure. Multiple changes related to the change of when swap is disabled in the suspend sequence contribute to the failure. See the i

[PATCH] drm/amd: Check whether secure display TA loaded successfully

2025-09-25 Thread Mario Limonciello
[Why] Not all renoir hardware supports secure display. If the TA is present but the feature isn't supported it will fail to load or send commands. This shows ERR messages to the user that make it seem like there is a problem. [How] Check the resp_status of the context to see if there was an erro

Re: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw

2025-09-25 Thread Chen, Xiaogang
On 9/24/2025 5:48 PM, Philip Yang wrote: On 2025-09-24 11:29, Yifan Zhang wrote: There is race in amdgpu_amdkfd_device_fini_sw and interrupt. if amdgpu_amdkfd_device_fini_sw run in b/w kfd_cleanup_nodes and kfree(kfd), and KGD interrupt generated. kernel panic log: BUG: kernel NULL point

[PATCH 0/5] DC: Properly disable scaling on DCE6

2025-09-25 Thread Timur Kristóf
This series fixes visual glitches on systems with SI GPUs where the BIOS sets up a default mode with scaling. Alex was kind enough to give me an extra register definition that can actually bypass the scaler on DCE6. Additionally, while testing the scaler under KDE, I noticed that it doesn't work w

[PATCH 5/5] drm/amd/display: Disable scaling on DCE6 for now

2025-09-25 Thread Timur Kristóf
Scaling doesn't work on DCE6 at the moment, the current register programming produces incorrect output when using fractional scaling (between 100-200%) on resolutions higher than 1080p. Disable it until we figure out how to program it properly. Fixes: 7c15fd86aaec ("drm/amd/display: dc/dce: add i

[PATCH v3 3/3] drm/amd: Fix hybrid sleep

2025-09-25 Thread Mario Limonciello (AMD)
[Why] commit 530694f54dd5e ("drm/amdgpu: do not resume device in thaw for normal hibernation") optimized the flow for systems that are going into S4 where the power would be turned off. Basically the thaw() callback wouldn't resume the device if the hibernation image was successfully created since

[PATCH v3 2/3] PM: hibernate: Add pm_hibernation_mode_is_suspend()

2025-09-25 Thread Mario Limonciello (AMD)
Some drivers have different flows for hibernation and suspend. If the driver opportunistically will skip thaw() then it needs a hint to know what is happening after the hibernate. Introduce a new symbol pm_hibernation_mode_is_suspend() that drivers can call to determine if suspending the system fo

[PATCH v3 1/3] PM: hibernate: Fix hybrid-sleep

2025-09-25 Thread Mario Limonciello (AMD)
Hybrid sleep will hibernate the system followed by running through the suspend routine. Since both the hibernate and the suspend routine will call pm_restrict_gfp_mask(), pm_restore_gfp_mask() must be called before starting the suspend sequence. Add an explicit call to pm_restore_gfp_mask() to po

[PATCH 4/5] drm/amd/display: Properly disable scaling on DCE6

2025-09-25 Thread Timur Kristóf
SCL_SCALER_ENABLE can be used to enable/disable the scaler on DCE6. Program it to 0 when scaling isn't used, 1 when used. Additionally, clear some other registers when scaling is disabled and program the SCL_UPDATE register as recommended. This fixes visible glitches for users whose BIOS sets up a

[PATCH 3/5] drm/amd/display: Properly clear SCL_*_FILTER_CONTROL on DCE6

2025-09-25 Thread Timur Kristóf
Previously, the code would set a bit field which didn't exist on DCE6 so it would be effectively a no-op. Fixes: b70aaf5586f2 ("drm/amd/display: dce_transform: add DCE6 specific macros,functions") Signed-off-by: Timur Kristóf --- drivers/gpu/drm/amd/display/dc/dce/dce_transform.c | 6 ++ 1

[PATCH 2/5] drm/amd/display: Add missing DCE6 SCL_HORZ_FILTER_INIT* SRIs

2025-09-25 Thread Timur Kristóf
Without these, it's impossible to program these registers. Fixes: 102b2f587ac8 ("drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init (v2)") Signed-off-by: Timur Kristóf --- drivers/gpu/drm/amd/display/dc/dce/dce_transform.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/d

[PATCH 1/5] drm/amdgpu: Add additional DCE6 SCL registers

2025-09-25 Thread Timur Kristóf
From: Alex Deucher Fixes: 102b2f587ac8 ("drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init (v2)") Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_d.h | 7 +++ drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_sh_mask.h | 2 ++ 2 files

Re: [PATCH V11 06/47] drm/colorop: Add 1D Curve subtype

2025-09-25 Thread Harry Wentland
On 2025-09-25 04:11, Pekka Paalanen wrote: On Tue, 23 Sep 2025 11:41:24 -0600 Alex Hung wrote: On 9/23/25 10:16, Alex Hung wrote: On 9/23/25 01:59, Pekka Paalanen wrote: On Mon, 22 Sep 2025 21:16:45 -0600 Alex Hung wrote: On 9/18/25 02:40, Pekka Paalanen wrote: ... The problem

Re: [PATCH v2 1/3] PM: hibernate: Fix hybrid-sleep

2025-09-25 Thread Rafael J. Wysocki
On Thu, Sep 25, 2025 at 7:51 PM Rafael J. Wysocki wrote: > > On Thu, Sep 25, 2025 at 7:47 PM Rafael J. Wysocki wrote: > > > > On Thu, Sep 25, 2025 at 5:59 PM Mario Limonciello (AMD) > > wrote: > > > > > > Hybrid sleep will hibernate the system followed by running through > > > the suspend routin

Re: [PATCH v2 1/3] PM: hibernate: Fix hybrid-sleep

2025-09-25 Thread Rafael J. Wysocki
On Thu, Sep 25, 2025 at 7:47 PM Rafael J. Wysocki wrote: > > On Thu, Sep 25, 2025 at 5:59 PM Mario Limonciello (AMD) > wrote: > > > > Hybrid sleep will hibernate the system followed by running through > > the suspend routine. Since both the hibernate and the suspend routine > > will call pm_rest

Re: [RFC v8 07/12] drm/sched: Account entity GPU time

2025-09-25 Thread Tvrtko Ursulin
On 24/09/2025 10:11, Philipp Stanner wrote: On Wed, 2025-09-03 at 11:18 +0100, Tvrtko Ursulin wrote: To implement fair scheduling we need a view into the GPU time consumed by entities. Problem we have is that jobs and entities objects have decoupled lifetimes, where at the point we have a view

Re: [PATCH 1/3] PM: hibernate: Fix hybrid-sleep

2025-09-25 Thread kernel test robot
patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Mario-Limonciello-AMD/PM-hibernate-Fix-hybrid-sleep/20250925-045432 base: https://git.kernel.org/pub/scm/linux/

Re: [PATCH] drm/amdgpu: Fix for GPU reset being blocked by KIQ I/O.

2025-09-25 Thread Philipp Stanner
On Thu, 2025-09-25 at 17:43 +0800, Heng Zhou wrote: > There is some probability that reset workqueue is blocked by KIQ I/O for 10+ > seconds after gpu hangs. > So we need to add an in_reset check during each KIQ register poll. > > Signed-off-by: Heng Zhou > --- You should create such patches wit

[PATCH 18/19 v6.1.y] minmax.h: simplify the variants of clamp()

2025-09-25 Thread Eliav Farber
From: David Laight [ Upstream commit 495bba17cdf95e9703af1b8ef773c55ef0dfe703 ] Always pass a 'type' through to __clamp_once(), pass '__auto_type' from clamp() itself. The expansion of __types_ok3() is reasonable so it isn't worth the added complexity of avoiding it when a fixed type is used fo

Re: [PATCH v2 0/3] Fixes for hybrid sleep

2025-09-25 Thread Rafael J. Wysocki
On Thu, Sep 25, 2025 at 5:59 PM Mario Limonciello (AMD) wrote: > > Ionut Nechita reported recently a hibernate failure, but in debugging > the issue it's actually not a hibernate failure; but a hybrid sleep > failure. > > Multiple changes related to the change of when swap is disabled in > the sus

[PATCH v2 1/3] PM: hibernate: Fix hybrid-sleep

2025-09-25 Thread Mario Limonciello (AMD)
Hybrid sleep will hibernate the system followed by running through the suspend routine. Since both the hibernate and the suspend routine will call pm_restrict_gfp_mask(), pm_restore_gfp_mask() must be called before starting the suspend sequence. Add an explicit call to pm_restore_gfp_mask() to po

[PATCH v2 2/3] PM: hibernate: Add pm_hibernation_mode_is_suspend()

2025-09-25 Thread Mario Limonciello (AMD)
Some drivers have different flows for hibernation and suspend. If the driver opportunistically will skip thaw() then it needs a hint to know what is happening after the hibernate. Introduce a new symbol pm_hibernation_mode_is_suspend() that drivers can call to determine if suspending the system fo

[PATCH v2 0/3] Fixes for hybrid sleep

2025-09-25 Thread Mario Limonciello (AMD)
Ionut Nechita reported recently a hibernate failure, but in debugging the issue it's actually not a hibernate failure; but a hybrid sleep failure. Multiple changes related to the change of when swap is disabled in the suspend sequence contribute to the failure. See the individual patches for deta

[PATCH] drm/amdgpu: Fix for GPU reset being blocked by KIQ I/O.

2025-09-25 Thread Heng Zhou
There is some probability that reset workqueue is blocked by KIQ I/O for 10+ seconds after gpu hangs. So we need to add an in_reset check during each KIQ register poll. Signed-off-by: Heng Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/

Re: [RFC v8 12/12] drm/sched: Embed run queue singleton into the scheduler

2025-09-25 Thread Tvrtko Ursulin
On 24/09/2025 13:01, Philipp Stanner wrote: On Wed, 2025-09-03 at 11:18 +0100, Tvrtko Ursulin wrote: Now that the run queue to scheduler relationship is always 1:1 we can embed it (the run queue) directly in the scheduler struct and save on some allocation error handling code and such. Looks

Re: [PATCH 2/3] PM: hibernate: Add pm_hibernation_mode_is_suspend()

2025-09-25 Thread kernel test robot
submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Mario-Limonciello-AMD/PM-hibernate-Fix-hybrid-sleep/20250925-045432 base: https://git.kernel.org/pub/scm/li

Re: [PATCH 0/3] DC: Reject too high pixel clocks on DCE6-10

2025-09-25 Thread Mario Limonciello
On 9/24/2025 6:38 AM, Timur Kristóf wrote: Reject modes with a pixel clock higher than the maximum display clock. These were never supported, but we haven't noticed the issue until the YCbCr 422 fallback was recently added. For example, the DP 1.2 standard technically supports 4K 120Hz YCbCr 422

RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw

2025-09-25 Thread Lazar, Lijo
[Public] I meant something like this. destroy_workqueue(kfd->ih_wq); kfd->ih_wq = NULL; Then check for NULL at the beginning of kgd2kfd_interrupt. If there is no IH workqueue, then there is no interrupt handling capability. May also be within the loop. Not sure if that is really required; if s

Re: [RFC v8 11/12] drm/sched: Remove FIFO and RR and simplify to a single run queue

2025-09-25 Thread Tvrtko Ursulin
On 24/09/2025 12:50, Philipp Stanner wrote: On Wed, 2025-09-03 at 11:18 +0100, Tvrtko Ursulin wrote: If the new fair policy is at least as good as FIFO and we can afford to remove round-robin, we can simplify the scheduler code by making the scheduler to run queue relationship always 1:1 and r

Re: [RFC v8 08/12] drm/sched: Remove idle entity from tree

2025-09-25 Thread Tvrtko Ursulin
On 24/09/2025 10:15, Philipp Stanner wrote: On Thu, 2025-09-11 at 16:06 +0100, Tvrtko Ursulin wrote: On 11/09/2025 15:32, Philipp Stanner wrote: On Wed, 2025-09-03 at 11:18 +0100, Tvrtko Ursulin wrote: There is no need to keep entities with no jobs in the tree so lets remove it once the las

Re: [RFC v8 09/12] drm/sched: Add fair scheduling policy

2025-09-25 Thread Tvrtko Ursulin
On 24/09/2025 10:38, Philipp Stanner wrote: On Wed, 2025-09-03 at 11:18 +0100, Tvrtko Ursulin wrote: Fair scheduling policy is built upon the same concepts as the well known CFS kernel scheduler - entity run queue is sorted by the virtual GPU time consumed by entities in a way that the entity

Re: [PATCH] drm/amdgpu: Merge amdgpu_vm_set_pasid into amdgpu_vm_init

2025-09-25 Thread Christian König
On 25.09.25 12:32, Jesse.Zhang wrote: > As KFD no longer uses a separate PASID, the global > amdgpu_vm_set_pasid() function is no longer necessary. > Merge its functionality directly into amdgpu_vm_init() to simplify code flow > and eliminate redundant locking. > > Suggested-by: Christian König >

[PATCH] drm/amdgpu: Merge amdgpu_vm_set_pasid into amdgpu_vm_init

2025-09-25 Thread Jesse . Zhang
As KFD no longer uses a separate PASID, the global amdgpu_vm_set_pasid() function is no longer necessary. Merge its functionality directly into amdgpu_vm_init() to simplify code flow and eliminate redundant locking. Suggested-by: Christian König Signed-off-by: Jesse Zhang --- drivers/gpu/drm/am

RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw

2025-09-25 Thread Zhang, Yifan
[Public] Hi Lijo, Do you mean a change like below ? Besides readability, is there functional improvement ? diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index e9cfb80bd436..86676acd9cbe 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c +++ b/d

[v7 2/2] drm/amdgpu: Implement user queue reset functionality

2025-09-25 Thread Jesse . Zhang
This patch adds robust reset handling for user queues (userq) to improve recovery from queue failures. The key components include: 1. Queue detection and reset logic: - amdgpu_userq_detect_and_reset_queues() identifies failed queues - Per-IP detect_and_reset callbacks for targeted recovery

[PATCH 05/19 v6.1.y] minmax: avoid overly complicated constant expressions in VM code

2025-09-25 Thread Eliav Farber
From: Linus Torvalds [ Upstream commit 3a7e02c040b130b5545e4b115aada7bacd80a2b6 ] The minmax infrastructure is overkill for simple constants, and can cause huge expansions because those simple constants are then used by other things. For example, 'pageblock_order' is a core VM constant, but bec

[PATCH 06/19 v6.1.y] minmax: simplify and clarify min_t()/max_t() implementation

2025-09-25 Thread Eliav Farber
From: Linus Torvalds [ Upstream commit 017fa3e89187848fd056af757769c9e66ac3e93d ] This simplifies the min_t() and max_t() macros by no longer making them work in the context of a C constant expression. That means that you can no longer use them for static initializers or for array sizes in type

[PATCH 07/19 v6.1.y] minmax: make generic MIN() and MAX() macros available everywhere

2025-09-25 Thread Eliav Farber
From: Linus Torvalds [ Upstream commit 1a251f52cfdc417c84411a056bc142cbd77baef4 ] This just standardizes the use of MIN() and MAX() macros, with the very traditional semantics. The goal is to use these for C constant expressions and for top-level / static initializers, and so be able to simplif

[PATCH 12/19 v6.1.y] minmax: fix up min3() and max3() too

2025-09-25 Thread Eliav Farber
From: Linus Torvalds [ Upstream commit 21b136cc63d2a9ddd60d4699552b69c214b32964 ] David Laight pointed out that we should deal with the min3() and max3() mess too, which still does excessive expansion. And our current macros are actually rather broken. In particular, the macros did this: #d

[PATCH 08/19 v6.1.y] minmax: add a few more MIN_T/MAX_T users

2025-09-25 Thread Eliav Farber
From: Linus Torvalds [ Upstream commit 4477b39c32fdc03363affef4b11d48391e6dc9ff ] Commit 3a7e02c040b1 ("minmax: avoid overly complicated constant expressions in VM code") added the simpler MIN_T/MAX_T macros in order to avoid some excessive expansion from the rather complicated regular min/max m

[PATCH 17/19 v6.1.y] minmax.h: move all the clamp() definitions after the min/max() ones

2025-09-25 Thread Eliav Farber
From: David Laight [ Upstream commit c3939872ee4a6b8bdcd0e813c66823b31e6e26f7 ] At some point the definitions for clamp() got added in the middle of the ones for min() and max(). Re-order the definitions so they are more sensibly grouped. Link: https://lkml.kernel.org/r/8bb285818e4846469121c8

[PATCH 10/19 v6.1.y] minmax: don't use max() in situations that want a C constant expression

2025-09-25 Thread Eliav Farber
From: Linus Torvalds [ Upstream commit cb04e8b1d2f24c4c2c92f7b7529031fc35a16fed ] We only had a couple of array[] declarations, and changing them to just use 'MAX()' instead of 'max()' fixes the issue. This will allow us to simplify our min/max macros enormously, since they can now unconditiona