On 6/26/26 12:04, Philipp Stanner wrote:
> The commit mentioned in the fixes tag below introduced a mechanism
> through which fence producers can fully decouple from fence consumers.
> This, desirable, mechanism is based on the fence's signaled-bit as the
> "decoupling point".
> 
> A sophisticated interaction between RCU and atomic instructions attempts
> to ensure that fence consumers can still interact with fence producers
> through the dma_fence_ops, callback pointers into the producer.
> 
> This is the desired behavior: to check for decoupling, the signaled-bit
> is first checked. If it's not yet signaled, RCU ensures that the ops
> pointer cannot yet be NULL.
> 
> Hereby, dma_fence_signal_timestamp_locked() first sets the signaled-bit,
> and then sets the ops pointer to NULL. Readers first load the ops
> pointer, and then check through the signaled-bit whether the pointer can
> legally be accessed.
> 
> These set and load operations could occur out of order on weakly ordered
> platforms. Hence, we need to enforce strict ordering all the time.

Ah! Good catch, now I've got what you mean with that.

> 
> Add the appropriate memory barriers.
> 
> Cc: [email protected]
> Fixes: f4cc3ab824d6 ("dma-buf: protected fence ops by RCU v8")
> Signed-off-by: Philipp Stanner <[email protected]>
> ---
> Tested with dmabuf and drm_sched unit tests.
> 
> Memory barriers are notoriously difficult, so I would appreciate if some
> of the more experienced folks can check this. Notably, I am not sure
> whether the smp_wmb() is necessary.
> 
> The documentation for test_and_set_bit() makes the mysterious statement
> "This is an atomic fully-ordered operation (implied full memory
> barrier)", but the kcsan_mb() seems to be some sort of debugging
> barrier, and in any case the docu doesn't make it obvious to me whether
> that "full barrier" comes before or after the bit setting takes place.
> 
> Moreover, in my opinion we should order dma_fence_is_signaled(), too –
> but if we agree to merge Christian's new series [1] that need should
> disappear.
> 
> 
> [1] 
> https://lore.kernel.org/dri-devel/[email protected]/
>  
> ---
>  drivers/dma-buf/dma-fence.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index c7ea1e75d38a..2e80b01499de 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -363,6 +363,18 @@ void dma_fence_signal_timestamp_locked(struct dma_fence 
> *fence,
>                                     &fence->flags)))
>               return;
>  
> +     /*
> +      * Fully order setting of the bit above with setting of the ops pointer
> +      * to NULL below, so that all parties can use the signaled flag to
> +      * detect that the fence decoupled from its ops in a safe manner.
> +      *
> +      * The counter parts of this barrier are in dma_fence_timeline_name()
> +      * and dma_fence_driver_name(). All other future parties that rely on
> +      * the signaled flag for valid access to the ops pointer will need a
> +      * memory barrier.
> +      */
> +     smp_wmb();
> +
>       trace_dma_fence_signaled(fence);
>  
>       /*
> @@ -1170,6 +1182,12 @@ const char __rcu *dma_fence_driver_name(struct 
> dma_fence *fence)
>  
>       /* RCU protection is required for safe access to returned string */
>       ops = rcu_dereference(fence->ops);
> +     /*
> +      * Fully order the dereference above with the flag check. Otherwise,
> +      * ops could be dereferenced as a NULL pointer. The barrier's
> +      * counterpart is in dma_fence_signal_timestamp_locked().
> +      */
> +     smp_rmb();
>       if (!dma_fence_test_signaled_flag(fence))

Instead of adding the smp_rmb() I think we should check the ops pointer here 
for consistency with the other cases where we call the ops functions.

>               return (const char __rcu *)ops->get_driver_name(fence);
>       else
> @@ -1203,6 +1221,12 @@ const char __rcu *dma_fence_timeline_name(struct 
> dma_fence *fence)
>  
>       /* RCU protection is required for safe access to returned string */
>       ops = rcu_dereference(fence->ops);
> +     /*
> +      * Fully order the dereference above with the flag check. Otherwise,
> +      * ops could be dereferenced as a NULL pointer. The barrier's
> +      * counterpart is in dma_fence_signal_timestamp_locked().
> +      */
> +     smp_rmb();
>       if (!dma_fence_test_signaled_flag(fence))

Same of course here as well.

Thanks,
Christian.

>               return (const char __rcu *)ops->get_driver_name(fence);
>       else
> 
> base-commit: cdeb2ccd993ed8647adbbda2c3b103aa717fd6f7

Reply via email to