Thank you for your contribution! Sashiko AI review found 7 potential issue(s) to consider:
Pre-existing issues: - [High] Missing validation of GSP RPC message lengths from firmware can lead to integer underflow and massive kernel memory corruption. - [High] Silent failure when rpc is NULL results in caller NULL pointer dereference. - [High] Buffer overflow and uninitialized memory gap when a continuation GSP RPC chunk wraps around the message queue ring buffer. - [High] Silent infinite stall of the GSP message queue when an oversized message is encountered. - [High] Missing read memory barrier (`rmb()`) between reading the shared memory write pointer and reading the message payload. - [High] Time-of-Check to Time-of-Use race condition on `rpc->length` bypasses bounds checks. - [High] Trusting the CPU read pointer (`rptr`) from shared memory leads to arbitrary memory access. -- commit 45236a2b4542b6f23a9a07cab6b6a3cd3588738d Author: Hongling Zeng <[email protected]> nouveau/gsp/rpc: Cleanup incorrect IS_ERR_OR_NULL in rpc.c Clean up incorrect IS_ERR_OR_NULL() usage for functions that never return NULL: - r535_gsp_msgq_peek() never returns NULL - r535_gsp_msgq_recv_one_elem() never returns NULL These functions should be checked with IS_ERR() instead. > diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c > b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c > index 2ca6de4094e63..0a00c7d831c91 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c > +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c > @@ -350,7 +350,7 @@ r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 gsp_rpc_len, > int *retries) [Severity: High] This is a pre-existing issue, but earlier in r535_gsp_msgq_recv(), returning NULL when an oversized message is encountered seems to stall the queue: r535_gsp_msgq_recv() { ... if (WARN_ON(rpc->length > max_rpc_size)) return NULL; ... } Does returning NULL here without advancing the queue read pointer cause an unrecoverable infinite stall, since subsequent reads would encounter the exact same oversized message? [Severity: High] This isn't a bug introduced by this patch, but there also appears to be a Time-of-Check to Time-of-Use race condition on rpc->length: r535_gsp_msgq_recv() { ... if (WARN_ON(rpc->length > max_rpc_size)) return NULL; buf = kvmalloc(max_t(u32, rpc->length, expected), GFP_KERNEL); ... } Since rpc points directly to DMA coherent memory shared with the GSP, could the length be overwritten in shared memory between the check and its use in kvmalloc() or info.gsp_rpc_len, bypassing the validation? > u32 size; > > rpc = r535_gsp_msgq_peek(gsp, sizeof(*rpc), info.retries); > - if (IS_ERR_OR_NULL(rpc)) { > + if (IS_ERR(rpc)) { > kvfree(buf); > return rpc; > } > @@ -359,7 +359,7 @@ r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 gsp_rpc_len, > int *retries) > info.continuation = true; > > rpc = r535_gsp_msgq_recv_one_elem(gsp, &info); [Severity: High] This is a pre-existing issue, but in r535_gsp_msgq_recv_one_elem() called here, is there a potential integer underflow? r535_gsp_msgq_recv_one_elem() { ... if (info->continuation) memcpy(buf, mqe->data + sizeof(struct nvfw_gsp_rpc), len - sizeof(struct nvfw_gsp_rpc)); ... } If the continuation message length read directly from shared memory is less than sizeof(struct nvfw_gsp_rpc) (32 bytes), does this calculation underflow and lead to a massive out-of-bounds read and write? [Severity: High] This is another pre-existing issue in r535_gsp_msgq_recv_one_elem(). Could a continuation chunk wrapping around the ring buffer cause an out-of-bounds write? r535_gsp_msgq_recv_one_elem() { ... if (expected) { mqe = (void *)((u8 *)gsp->shm.msgq.ptr + 0x1000 + 0 * 0x1000); memcpy(buf + len, mqe, expected); } ... } If the first copy only advanced the buffer logically by len - 32 due to skipping the header, would writing the second part at buf + len leave a 32-byte gap of uninitialized memory and overflow the end of the allocated buffer by 32 bytes? [Severity: High] This isn't a bug introduced by this patch, but does the underlying r535_gsp_msgq_wait() function lack a required memory barrier? r535_gsp_msgq_wait() { ... u32 wptr = *gsp->msgq.wptr; ... } When polling the shared memory write pointer, does the host CPU need an rmb() after observing the updated write pointer? Without it, could the CPU speculatively fetch stale payload data before the pointer update is fully observed? [Severity: High] This is a pre-existing issue, but does r535_gsp_msgq_get_entry() safely handle the CPU read pointer? r535_gsp_msgq_get_entry() { u32 rptr = *gsp->msgq.rptr; return (void *)((u8 *)gsp->shm.msgq.ptr + GSP_PAGE_SIZE + rptr * GSP_PAGE_SIZE); } Since gsp->msgq.rptr points to DMA coherent memory, could a compromised GSP modify this pointer to an arbitrarily large value and force the host to read out-of-bounds host memory? > - if (IS_ERR_OR_NULL(rpc)) { > + if (IS_ERR(rpc)) { > kvfree(buf); > return rpc; > } [ ... ] > @@ -499,7 +499,7 @@ r535_gsp_msg_recv(struct nvkm_gsp *gsp, int fn, u32 > gsp_rpc_len) > > retry: > rpc = r535_gsp_msgq_peek(gsp, sizeof(*rpc), &retries); > - if (IS_ERR_OR_NULL(rpc)) > + if (IS_ERR(rpc)) > return rpc; > > rpc = r535_gsp_msgq_recv(gsp, gsp_rpc_len, &retries); [Severity: High] This isn't a bug introduced by this patch, but when r535_gsp_msgq_recv() returns NULL (for example, when encountering an oversized message), does this result in a silent failure upstream? The NULL propagates out to r535_gsp_rpc_rm_ctrl_push(), which handles it like this: r535_gsp_rpc_rm_ctrl_push() { ... if (IS_ERR_OR_NULL(rpc)) { *params = NULL; return PTR_ERR(rpc); } ... } Since PTR_ERR(NULL) is 0 (success), does this mask the failure and cause upstream callers to dereference a NULL pointer thinking the call succeeded? -- Sashiko AI review ยท https://sashiko.dev/#/patchset/[email protected]?part=2
