Thank you for your contribution! Sashiko AI review found 3 potential issue(s) 
to consider:

Pre-existing issues:
- [Critical] The GSP message queue receiver lacks bounds checking on 
continuation records, allowing a compromised GSP to overflow the host kernel 
heap.
- [High] `nvkm_gsp_rpc_push()` and `nvkm_gsp_rpc_rd()` can return `NULL` when 
expecting a reply (`RECV` policy), contrary to the commit message's claims, 
leading to NULL pointer dereferences in callers.
- [High] Memory leak of the original RPC payload (`msg`) in 
`r535_gsp_rpc_push()` when handling a large payload (`payload_size > 
max_payload_size`) and an error occurs.
--

commit 8fadcaf9fa12c03760c8ed228ffe647df6013918
Author: Hongling Zeng <[email protected]>

nouveau/gsp: Cleanup IS_ERR_OR_NULL in nvkm_gsp_rpc_rd()

nvkm_gsp_rpc_rd() calls nvkm_gsp_rpc_get() which never returns NULL,
only valid pointers or error pointers. Clean up by using IS_ERR()
instead, matching the actual return value contract.

> Note: nvkm_gsp_rpc_push() can return NULL when policy is NOWAIT or
> NOSEQ, but nvkm_gsp_rpc_rd() uses RECV policy which expects a reply.
> If the RPC fails to get a reply, it should return an error pointer,
> not NULL.

[Severity: High]
This is a pre-existing issue, but the commit message states that
nvkm_gsp_rpc_push() cannot return NULL when using the RECV policy. Is this
accurate?

Looking at the call chain, it appears NULL can be returned and propagate up.
For example, r535_gsp_msgq_recv() returns NULL when the payload length
exceeds max bounds:

    if (WARN_ON(rpc->length > max_rpc_size))
        return NULL;

This NULL propagates through r535_gsp_msg_recv() to
r535_gsp_rpc_handle_reply() and then to nvkm_gsp_rpc_push() before being
returned by nvkm_gsp_rpc_rd().

If callers like r535_gsp_get_static_info() only check IS_ERR(), won't they
subsequently dereference the NULL pointer?

r535_gsp_get_static_info() {
    ...
    rpc = nvkm_gsp_rpc_rd(gsp, NV_VGPU_MSG_FUNCTION_GET_GSP_STATIC_INFO, 
sizeof(*rpc));
    if (IS_ERR(rpc))
        return PTR_ERR(rpc);
    
    gsp->internal.client.object.handle = rpc->hInternalClient;
    ...
}

> diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h 
> b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
> index 64fed208e4cf8..a061779e0107b 100644
> --- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
> +++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
> @@ -293,7 +293,7 @@ nvkm_gsp_rpc_rd(struct nvkm_gsp *gsp, u32 fn, u32 argc)
>  {
>       void *argv = nvkm_gsp_rpc_get(gsp, fn, argc);
>  
> -     if (IS_ERR_OR_NULL(argv))
> +     if (IS_ERR(argv))
>               return argv;
>  
>       return nvkm_gsp_rpc_push(gsp, argv, NVKM_GSP_RPC_REPLY_RECV, argc);

[Severity: High]
This isn't a bug introduced by this patch, but does nvkm_gsp_rpc_push() leak
memory on error paths for large payloads?

When processing multi-part chunks in r535_gsp_rpc_push(), the original
message buffer is conditionally freed only on success:

    if (IS_ERR(next)) {
        repv = next;
        goto done;
    }
...
    if (!IS_ERR(repv))
        kvfree(msg);
...
done:
    mutex_unlock(&gsp->cmdq.mutex);
    return repv;

If an error occurs, the function jumps to the done label and returns an
error pointer without freeing the msg buffer. Since nvkm_gsp_rpc_rd() expects
the push operation to consume and free the payload buffer, does this result
in a persistent memory leak?

[Severity: Critical]
This is a pre-existing issue, but does the GSP message queue receiver have a
bounds checking issue that could overflow the host kernel heap?

In r535_gsp_msgq_recv(), if the first message's length is larger than the
expected size, the subtraction underflows:

    info.gsp_rpc_buf += info.gsp_rpc_len;
    expected -= info.gsp_rpc_len;

The loop then continuously processes continuation records because expected is
now a large non-zero value. It peeks at the next message length and calls
r535_gsp_msgq_recv_one_elem(), which uses memcpy to blindly copy the data:

    if (info->continuation)
        memcpy(buf, mqe->data + sizeof(struct nvfw_gsp_rpc),
               len - sizeof(struct nvfw_gsp_rpc));
    else
        memcpy(buf, mqe->data, len);

Could a compromised GSP firmware exploit this to overwrite adjacent memory?

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=5

Reply via email to