Thank you for your contribution! Sashiko AI review found 1 potential issue(s) 
to consider:

Pre-existing issues:
- [High] Incomplete Fix / Kernel panic on module load error path: If 
`gpu_buddy_init()` fails, it cleans up its internal allocations but leaves 
`mm->n_roots > 0` and `mm->free_trees` and `mm->roots` as dangling or NULL 
pointers. Because `adev->mman.initialized` is set to `true` prior to calling 
`amdgpu_vram_mgr_init()`, a failure in `gpu_buddy_init()` unconditionally 
triggers the driver teardown path. `amdgpu_vram_mgr_fini()` then invokes 
`gpu_buddy_fini()`, which dereferences the invalid pointers and causes a kernel 
panic.
--

commit 88a159aad4fc080127cdbd3bc127dfaae0ae9c17
Author: Thomas Hellström <[email protected]>

drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init()

This commit fixes kernel panics on the module load error path by moving
drmm_cgroup_register_region() after the list and buddy allocator are fully
initialized.

Closes: 
https://sashiko.dev/#/patchset/[email protected]?part=4

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 2a241a5b12c42..ac3f71d77140a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
[ ... ]
> @@ -935,6 +932,10 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
>       if (err)
>               return err;

[Severity: High]
This is a pre-existing issue, but if gpu_buddy_init() fails, does it still
leave the system vulnerable to the same module load error path panic?

Looking at drivers/gpu/buddy.c, if gpu_buddy_init() fails after setting
mm->n_roots, the error path cleans up allocations but leaves mm->n_roots
populated and mm->roots as a dangling or NULL pointer:

drivers/gpu/buddy.c:gpu_buddy_init() {
    ...
    mm->n_roots = hweight64(size);
    ...
out_free_tree:
    while (i--)
        kfree(mm->free_trees[i]);
    kfree(mm->free_trees);
    return -ENOMEM;
}

Since adev->mman.initialized is set to true before amdgpu_vram_mgr_init() is
called, returning this error will trigger the teardown path where
amdgpu_vram_mgr_fini() unconditionally calls gpu_buddy_fini():

drivers/gpu/buddy.c:gpu_buddy_fini() {
    ...
    for (i = 0; i < mm->n_roots; ++i) {
        start = gpu_buddy_block_offset(mm->roots[i]);
    ...
}

Because mm->n_roots is non-zero, will this unconditionally dereference the
invalid mm->roots pointer and cause a panic?

> +     man->cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram", 
> adev->gmc.real_vram_size);
> +     if (IS_ERR(man->cg))
> +             return PTR_ERR(man->cg);
> +
>       ttm_set_driver_manager(&adev->mman.bdev, TTM_PL_VRAM, &mgr->manager);
>       ttm_resource_manager_set_used(man, true);
>       return 0;

-- 
Sashiko AI review · 
https://sashiko.dev/#/patchset/[email protected]?part=1

Reply via email to