Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
Pre-existing issues: - [High] Early returns in `receive_msg` skip advancing the GSP read pointer, leading to a permanently stalled command queue. - [High] CPU->GSP messaging path relies on a fragile control dependency for memory ordering instead of an explicit DMA barrier. -- commit 4410cc89c9104f4a90dab2676e5da7ce4fd1a204 Author: Gary Guo <[email protected]> gpu: nova-core: fix barrier usage in GSP->CPU messaging path This commit replaces the Rust SeqCst barrier with explicit dma_mb(Read) and dma_mb(Full) barriers in the GSP->CPU messaging path. The SeqCst barrier does not correctly order DMA operations on architectures like ARM, so using the proper dma_mb() ensures correct LOAD->LOAD and LOAD->STORE ordering. > diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs > b/drivers/gpu/nova-core/gsp/cmdq.rs > index 94c2790d943d0..229d49a670779 100644 > --- a/drivers/gpu/nova-core/gsp/cmdq.rs > +++ b/drivers/gpu/nova-core/gsp/cmdq.rs [ ... ] > @@ -339,6 +341,9 @@ fn driver_write_area_size(&self) -> usize { > (MSGQ_NUM_PAGES, tx) > }; > > + // ORDERING: Ensure data load is ordered after load of GSP write > pointer. > + dma_mb(Read); > + [Severity: High] This is a pre-existing issue, but since we are updating memory barriers here, does the CPU->GSP messaging path in driver_write_area() also need an explicit barrier instead of relying on a fragile control dependency? drivers/gpu/nova-core/gsp/cmdq.rs:driver_write_area() { ... let (tail_end, wrap_end) = if rx == 0 { ... } else if rx <= tx { ... } else { ... }; // ORDERING: control dependency provides necessary LOAD->STORE ordering. // dma_mb(Full) may be used here if we don't want to rely on control dependency. ... } Relying on control dependencies for DMA memory ordering is unsafe in Rust, as the compiler does not guarantee preserving them and may optimize away branches (e.g., using conditional selects like csel on ARM64). Control dependencies also do not emit architectural barriers (like dmb osh on ARM64), which are necessary to order memory accesses across the system interconnect for device memory. If the CPU's store to the command buffer is observed by the GSP before the CPU's load of rx, could the CPU overwrite data the GSP has not yet consumed? Should an explicit dma_mb(Full) be added here to guarantee LOAD->STORE ordering? > // SAFETY: > // - `data` was created from a valid pointer, and `rx` and `tx` are > in the > // `0..MSGQ_NUM_PAGES` range per the invariants of `gsp_write_ptr` > and `cpu_read_ptr`, [ ... ] > @@ -436,6 +441,9 @@ fn cpu_read_ptr(&self) -> u32 { > > // Informs the GSP that it can send `elem_count` new pages into the > message queue. > fn advance_cpu_read_ptr(&mut self, elem_count: u32) { > + // ORDERING: Ensure read pointer is properly ordered with a > LOAD->STORE ordering. > + dma_mb(Full); > + > super::fw::gsp_mem::advance_cpu_read_ptr(&self.0, elem_count) > } [Severity: High] This is a pre-existing issue, but looking at how advance_cpu_read_ptr() is used, do the early returns in receive_msg() skip advancing the GSP read pointer? drivers/gpu/nova-core/gsp/cmdq.rs:receive_msg() { ... let function = message.header.function().map_err(|_| EINVAL)?; // Extract the message. Store the result as we want to advance the read pointer even in // case of failure. let result = if function == M::FUNCTION { let (cmd, contents_1) = M::Message::from_bytes_prefix(message.contents.0).ok_or(EIO)?; ... // Advance the read pointer past this message. self.gsp_mem.advance_cpu_read_ptr(...); ... } The comments note that the result is stored to ensure the read pointer is advanced even on failure. However, the use of the ? operator on message.header.function() and M::Message::from_bytes_prefix() causes an immediate return. If an early return happens here, will subsequent polls read the exact same malformed message? Could this cause repeated failures and a permanently stalled command queue? -- Sashiko AI review ยท https://sashiko.dev/#/patchset/[email protected]?part=4
