Thank you for your contribution! Sashiko AI review found 3 potential issue(s) 
to consider:

New issues:
- [High] Polling queue pointers to attempt mutual exclusion between the kernel 
and GSP creates a Time-of-Check to Time-of-Use (TOCTOU) race condition and 
incorrectly identifies active processing windows as idle.

Pre-existing issues:
- [High] `send_msg` does not validate that the packet length is within the FSP 
EMEM channel boundaries before writing.
- [High] `QUEUE_TAIL` is written before `QUEUE_HEAD` is reset, potentially 
exposing an invalid queue state to the FSP.
--

commit cbbcd901d8b26aba0c2ec257efff6284b3bf260b
Author: Eliot Courtney <[email protected]>

gpu: nova-core: fsp: try to enforce exclusive access to FSP channel
    
Currently, send_msg assumes that the channel to FSP is free to write
into. But, it might not be. Both the kernel driver and GSP communicate
with FSP. The way they should attempt to keep exclusive access to this
channel to FSP is by making sure they don't try to start writing if
there's pending data until the full round trip has finished.

> diff --git a/drivers/gpu/nova-core/falcon/fsp.rs 
> b/drivers/gpu/nova-core/falcon/fsp.rs
> index 21eaa8e261ce6..cdb476894e1ad 100644
> --- a/drivers/gpu/nova-core/falcon/fsp.rs
> +++ b/drivers/gpu/nova-core/falcon/fsp.rs

[ ... ]

> @@ -133,6 +153,9 @@ pub(crate) fn send_msg(&mut self, bar: Bar0<'_>, packet: 
> &[u8]) -> Result {
>              return Err(EINVAL);
>          }
>  
> +        // Try to make sure we have exclusive access to the FSP at this 
> point.
> +        self.wait_until_ready(bar)?;
> +
>          self.write_emem(bar, packet)?;

[Severity: High]
Does this introduce a time-of-check to time-of-use (TOCTOU) race condition?

The kernel and the GSP run asynchronously.

This lockless software polling approach does not appear to provide strict
mutual exclusion between the two processors.

Could both evaluate the idle condition (qhead == qtail && mhead == mtail)
simultaneously in wait_until_ready() and subsequently write to the shared
FSP EMEM channel at the same time, leading to hardware message corruption?

[Severity: High]
This isn't a bug introduced by this patch, but does this write operation lack
a bounds check against the FSP EMEM channel capacity?

There does not appear to be a check ensuring that packet.len() is within
FSP_EMEM_CHANNEL_0_SIZE (1024 bytes).

If a caller submits a larger packet, could write_emem() overwrite adjacent
hardware memory and potentially corrupt secure boot data or other channel
states?

>  
>          // Update queue pointers. TAIL points at the last DWORD written.
>          let tail_offset = u32::try_from(packet.len() - 4).map_err(|_| 
> EINVAL)?;
>          bar.write(
>              Array::at(0),
>              regs::NV_PFSP_QUEUE_TAIL::zeroed().with_address(tail_offset),
>          );
>          bar.write(
>              Array::at(0),
>              regs::NV_PFSP_QUEUE_HEAD::zeroed().with_address(0),
>          );

[Severity: High]
This is a pre-existing issue, but does writing QUEUE_TAIL before resetting
QUEUE_HEAD expose an invalid queue state to the hardware?

If QUEUE_HEAD contains a non-zero value from a previous transaction, writing
QUEUE_TAIL first could cause the hardware to briefly observe a state where
TAIL < HEAD.

Because the queue is not circular, could this trigger an immediate hardware
processing error before the kernel updates QUEUE_HEAD to 0?

>  
>          Ok(())
>      }

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=3

Reply via email to