On 13.10.25 09:09, Lazar, Lijo wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
>> -----Original Message-----
>> From: Zhang, Jesse(Jie) <[email protected]>
>> Sent: Monday, October 13, 2025 11:25 AM
>> To: Lazar, Lijo <[email protected]>; [email protected]; dri-
>> [email protected]
>> Cc: Deucher, Alexander <[email protected]>; Koenig, Christian
>> <[email protected]>; Yang, Philip <[email protected]>
>> Subject: RE: [PATCH] drm/ttm: Add NULL check in
>> ttm_resource_manager_usage
>>
>> [AMD Official Use Only - AMD Internal Distribution Only]
>>
>>> -----Original Message-----
>>> From: Lazar, Lijo <[email protected]>
>>> Sent: Monday, October 13, 2025 12:37 PM
>>> To: Zhang, Jesse(Jie) <[email protected]>;
>>> [email protected]; [email protected]
>>> Cc: Deucher, Alexander <[email protected]>; Koenig, Christian
>>> <[email protected]>; Zhang, Jesse(Jie) <[email protected]>;
>>> Yang, Philip <[email protected]>; Zhang, Jesse(Jie)
>>> <[email protected]>
>>> Subject: RE: [PATCH] drm/ttm: Add NULL check in
>>> ttm_resource_manager_usage
>>>
>>> [AMD Official Use Only - AMD Internal Distribution Only]
>>>
>>> The specific issue of trace with amdgpu_mem_info_vram_used_show should
>>> be fixed with this one - "drm/amdgpu: hide VRAM sysfs attributes on
>>> GPUs without VRAM"
>> Thanks @Lazar, Lijo, maybe we still can use this patch to fix this crash
>> when calling AMDGPU_CS and query AMDGPU_INFO_VRAM_USAGE.
>> or add check like the previous patch.
>>
> [lijo]
>
> Agree, there are indeed multiple places of ttm_resource_manager_usage call.
> You may follow the same check as in the hide VRAM patch -
> ttm_resource_manager_used - in case ttm doesn't take this change.

Yeah, agree. When the VRAM manager isn't initialized we shouldn't be calling any of its functions in the first place.

Maybe it is a good idea to add something like "if (WARN_ON_ONCE(!man)) return 0;" to prevent the crashes and only get a nice warning into the system log.

Regards,
Christian.

>
> Thanks,
> Lijo
>
>> Regards
>> Jesse
>>
>> [ 911.954646] BUG: kernel NULL pointer dereference, address: 00000000000008f8
>> [ 911.962437] #PF: supervisor write access in kernel mode
>> [ 912.007045] RIP: 0010:_raw_spin_lock+0x1e/0x40
>> [ 912.105151] amdttm_resource_manager_usage+0x1f/0x40 [amdttm]
>> [ 912.111579] amdgpu_cs_parser_bos.isra.0+0x543/0x800 [amdgpu]
>>
>>>
>>> Thanks,
>>> Lijo
>>>> -----Original Message-----
>>>> From: amd-gfx <[email protected]> On Behalf Of
>>>> Jesse.Zhang
>>>> Sent: Monday, October 13, 2025 7:25 AM
>>>> To: [email protected]; [email protected]
>>>> Cc: Deucher, Alexander <[email protected]>; Koenig, Christian
>>>> <[email protected]>; Zhang, Jesse(Jie) <[email protected]>;
>>>> Yang, Philip <[email protected]>; Zhang, Jesse(Jie)
>>>> <[email protected]>
>>>> Subject: [PATCH] drm/ttm: Add NULL check in
>>>> ttm_resource_manager_usage
>>>>
>>>> Add a NULL pointer check in ttm_resource_manager_usage() to prevent
>>>> kernel NULL pointer dereferences when the function is called with an
>>>> uninitialized resource manager.
>>>>
>>>> This fixes a kernel OOPS observed on APU devices where the VRAM
>>>> resource manager is not fully initialized, but various sysfs and
>>>> debug interfaces still attempt to query VRAM usage statistics.
>>>>
>>>> The crash backtrace showed:
>>>> BUG: kernel NULL pointer dereference, address: 00000000000008f8
>>>> Call Trace:
>>>> amdttm_resource_manager_usage+0x1f/0x40 [amdttm]
>>>> amdgpu_mem_info_vram_used_show+0x1e/0x40 [amdgpu]
>>>> dev_attr_show+0x1d/0x40
>>>> kernfs_seq_show+0x27/0x30
>>>>
>>>> By returning 0 for NULL managers, we allow callers to safely query
>>>> usage information even when the underlying resource manager is not
>>>> available, which is the expected behavior for devices without
>>>> dedicated VRAM like APUs.
>>>>
>>>> Suggested-by: Philip Yang <[email protected]>
>>>> Signed-off-by: Jesse Zhang <[email protected]>
>>>> ---
>>>> drivers/gpu/drm/ttm/ttm_resource.c | 3 +++
>>>> 1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c
>>>> b/drivers/gpu/drm/ttm/ttm_resource.c
>>>> index e2c82ad07eb4..e4d45f75e40a 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_resource.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_resource.c
>>>> @@ -587,6 +587,9 @@ uint64_t ttm_resource_manager_usage(struct
>>>> ttm_resource_manager *man)
>>>>  {
>>>>  	uint64_t usage;
>>>>
>>>> +	if (!man)
>>>> +		return 0;
>>>> +
>>>>  	spin_lock(&man->bdev->lru_lock);
>>>>  	usage = man->usage;
>>>>  	spin_unlock(&man->bdev->lru_lock);
>>>> --
>>>> 2.49.0
>>>
>>
>
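
For illustration only, here is a minimal userspace sketch of how the "if (WARN_ON_ONCE(!man)) return 0;" variant differs from the silent NULL check in the patch: the first NULL query logs a warning, later ones stay quiet, and every NULL query returns 0. Everything in it is a stand-in and an assumption: the mock_* types and functions are hypothetical, mock_warn_on_once() only imitates the warn-once behaviour of the kernel's WARN_ON_ONCE(), and the lock is a plain integer rather than a real spinlock. It is not the actual TTM code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel structures; NOT the real TTM types. */
struct mock_device {
	int lru_lock;		/* placeholder for the spinlock */
};

struct mock_resource_manager {
	struct mock_device *bdev;
	uint64_t usage;
};

/* Counts emitted warnings so the warn-once behaviour can be observed. */
int warn_count;

/*
 * Userspace imitation of WARN_ON_ONCE(): returns the condition and
 * prints a warning only the first time the condition is true.
 */
bool mock_warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Sketch of the usage query with the warn-and-return-0 guard. */
uint64_t mock_resource_manager_usage(struct mock_resource_manager *man)
{
	static bool warned;
	uint64_t usage;

	if (mock_warn_on_once(man == NULL, &warned,
			      "usage queried on a NULL resource manager"))
		return 0;

	man->bdev->lru_lock = 1;	/* stands in for spin_lock() */
	usage = man->usage;
	man->bdev->lru_lock = 0;	/* stands in for spin_unlock() */

	return usage;
}
```

Compared with the plain "if (!man) return 0;" in the patch, a guard of this shape keeps the caller bug visible in the log instead of hiding it, which matches the point that callers should not query an uninitialized manager in the first place.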
