rpc: Cleanup incorrect IS_ERR_OR_NULL in rpc.c

sashiko-bot Sun, 21 Jun 2026 19:28:01 -0700

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) 
to consider:


Pre-existing issues:
- [Critical] Missing lower and relative bounds validation for GSP RPC message 
lengths leads to multiple severe memory corruption vulnerabilities, including 
out-of-bounds reads, massive integer/buffer underflows, and kernel heap 
overflows.
- [Critical] Incorrect destination offset calculation when receiving a 
wrap-around continuation element leaves a 32-byte gap in the payload and causes 
a 32-byte heap buffer overflow.
- [High] Returning `NULL` instead of an `ERR_PTR` on abnormally large 
`rpc->length` causes silent failure propagation, leading to NULL pointer 
dereferences in caller functions and queue deadlocks.
- [Medium] Large payload requests are leaked on error paths in 
`r535_gsp_rpc_push()`.
--

commit 849e62a17ca943158f16f67ef2b39000a10c58b6
Author: Hongling Zeng <[email protected]>

nouveau/gsp/rpc: Cleanup incorrect IS_ERR_OR_NULL in rpc.c

Clean up incorrect IS_ERR_OR_NULL() usage for functions that never
return NULL:
- r535_gsp_msgq_peek() never returns NULL
- r535_gsp_msgq_recv_one_elem() never returns NULL

These functions should be checked with IS_ERR() instead.

> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c 
> b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c
> index 35272c1538699..25d3adbc13550 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c
> @@ -350,7 +350,7 @@ r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 gsp_rpc_len, 
> int *retries)
>               u32 size;
>  
>               rpc = r535_gsp_msgq_peek(gsp, sizeof(*rpc), info.retries);
> -             if (IS_ERR_OR_NULL(rpc)) {
> +             if (IS_ERR(rpc)) {

[Severity: High]
This is a pre-existing issue, but does r535_gsp_msgq_recv() correctly handle
returning NULL?

Earlier in r535_gsp_msgq_recv(), if the length exceeds the maximum, it returns
NULL:

    if (WARN_ON(rpc->length > max_rpc_size))
        return NULL;

When this NULL is propagated to callers like r535_gsp_rpc_rm_ctrl_push(), it
can translate to a 0 (success) return code because PTR_ERR(NULL) is 0. This
can lead to silent failure propagation and queue deadlocks. Would it be better
to return an ERR_PTR() instead?

[Severity: Critical]
This is also a pre-existing issue, but does rpc->length need a lower bounds
check?

In r535_gsp_msgq_recv(), the code verifies rpc->length <= max_rpc_size, but
doesn't verify if it is smaller than sizeof(struct nvfw_gsp_rpc). 

If the length is smaller than the struct size, calculations like 
len - sizeof(struct nvfw_gsp_rpc) in r535_gsp_msgq_recv_one_elem() can
underflow. Could this lead to an out-of-bounds read and massive buffer
underflows during memcpy?

>                       kvfree(buf);
>                       return rpc;
>               }
> @@ -359,7 +359,7 @@ r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 gsp_rpc_len, 
> int *retries)
>               info.continuation = true;
>  
>               rpc = r535_gsp_msgq_recv_one_elem(gsp, &info);
> -             if (IS_ERR_OR_NULL(rpc)) {
> +             if (IS_ERR(rpc)) {

[Severity: Critical]
This isn't a bug introduced by this patch, but does
r535_gsp_msgq_recv_one_elem() calculate the destination offset correctly for
continuation elements?

When info.continuation is true and the message wraps around the ring buffer,
the second memcpy() writes to buf + len:

r535_gsp_msgq_recv_one_elem() {
    ...
    if (expected) {
        mqe = (void *)((u8 *)gsp->shm.msgq.ptr + 0x1000 + 0 * 0x1000);
        memcpy(buf + len, mqe, expected);
    }
    ...
}

Since len isn't adjusted for the sizeof(struct nvfw_gsp_rpc) that was skipped
in the first memcpy, does this leave an uninitialized gap and cause a heap
buffer overflow by writing past the end of the kvmalloc-allocated buffer?

>                       kvfree(buf);
>                       return rpc;
>               }

[ ... ]

> @@ -499,7 +499,7 @@ r535_gsp_msg_recv(struct nvkm_gsp *gsp, int fn, u32 
> gsp_rpc_len)
>  
>  retry:
>       rpc = r535_gsp_msgq_peek(gsp, sizeof(*rpc), &retries);
> -     if (IS_ERR_OR_NULL(rpc))
> +     if (IS_ERR(rpc))
>               return rpc;
>  
>       rpc = r535_gsp_msgq_recv(gsp, gsp_rpc_len, &retries);

[Severity: Medium]
While reviewing these changes, I noticed another pre-existing issue. Does
r535_gsp_rpc_push() leak the msg buffer on error paths?

In r535_gsp_rpc_push(), msg is explicitly freed on success:

r535_gsp_rpc_push() {
    ...
    if (!IS_ERR(repv))
        kvfree(msg);
    ...
}

However, if any intermediate chunk fails to send or if handling the reply
fails, the code jumps to the done: label and returns an error without freeing
msg. Should msg be freed on these error paths as well?

-- 
Sashiko AI review · 
https://sashiko.dev/#/patchset/[email protected]?part=2

Re: [PATCH v3 2/5] nouveau/gsp/rpc: Cleanup incorrect IS_ERR_OR_NULL in rpc.c

Reply via email to