On 6/26/26 12:04, Philipp Stanner wrote: > The commit mentioned in the fixes tag below introduced a mechanism > through which fence producers can fully decouple from fence consumers. > This, desirable, mechanism is based on the fence's signaled-bit as the > "decoupling point". > > A sophisticated interaction between RCU and atomic instructions attempts > to ensure that fence consumers can still interact with fence producers > through the dma_fence_ops, callback pointers into the producer. > > This is the desired behavior: to check for decoupling, the signaled-bit > is first checked. If it's not yet signaled, RCU ensures that the ops > pointer cannot yet be NULL. > > Hereby, dma_fence_signal_timestamp_locked() first sets the signaled-bit, > and then sets the ops pointer to NULL. Readers first load the ops > pointer, and then check through the signaled-bit whether the pointer can > legally be accessed. > > These set and load operations could occur out of order on weakly ordered > platforms. Hence, we need to enforce strict ordering all the time.
Ah! Good catch, now I've got what you mean with that. > > Add the appropriate memory barriers. > > Cc: [email protected] > Fixes: f4cc3ab824d6 ("dma-buf: protected fence ops by RCU v8") > Signed-off-by: Philipp Stanner <[email protected]> > --- > Tested with dmabuf and drm_sched unit tests. > > Memory barriers are notoriously difficult, so I would appreciate if some > of the more experienced folks can check this. Notably, I am not sure > whether the smp_wmb() is necessary. > > The documentation for test_and_set_bit() makes the mysterious statement > "This is an atomic fully-ordered operation (implied full memory > barrier)", but the kcsan_mb() seems to be some sort of debugging > barrier, and in any case the docu doesn't make it obvious to me whether > that "full barrier" comes before or after the bit setting takes place. > > Moreover, in my opinion we should order dma_fence_is_signaled(), too – > but if we agree to merge Christian's new series [1] that need should > disappear. > > > [1] > https://lore.kernel.org/dri-devel/[email protected]/ > > --- > drivers/dma-buf/dma-fence.c | 24 ++++++++++++++++++++++++ > 1 file changed, 24 insertions(+) > > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c > index c7ea1e75d38a..2e80b01499de 100644 > --- a/drivers/dma-buf/dma-fence.c > +++ b/drivers/dma-buf/dma-fence.c > @@ -363,6 +363,18 @@ void dma_fence_signal_timestamp_locked(struct dma_fence > *fence, > &fence->flags))) > return; > > + /* > + * Fully order setting of the bit above with setting of the ops pointer > + * to NULL below, so that all parties can use the signaled flag to > + * detect that the fence decoupled from its ops in a safe manner. > + * > + * The counter parts of this barrier are in dma_fence_timeline_name() > + * and dma_fence_driver_name(). All other future parties that rely on > + * the signaled flag for valid access to the ops pointer will need a > + * memory barrier. > + */ > + smp_wmb(); > + > trace_dma_fence_signaled(fence); > > /* > @@ -1170,6 +1182,12 @@ const char __rcu *dma_fence_driver_name(struct > dma_fence *fence) > > /* RCU protection is required for safe access to returned string */ > ops = rcu_dereference(fence->ops); > + /* > + * Fully order the dereference above with the flag check. Otherwise, > + * ops could be dereferenced as a NULL pointer. The barrier's > + * counterpart is in dma_fence_signal_timestamp_locked(). > + */ > + smp_rmb(); > if (!dma_fence_test_signaled_flag(fence)) Instead of adding the smp_rmb() I think we should check the ops pointer here for consistency with the other cases where we call the ops functions. > return (const char __rcu *)ops->get_driver_name(fence); > else > @@ -1203,6 +1221,12 @@ const char __rcu *dma_fence_timeline_name(struct > dma_fence *fence) > > /* RCU protection is required for safe access to returned string */ > ops = rcu_dereference(fence->ops); > + /* > + * Fully order the dereference above with the flag check. Otherwise, > + * ops could be dereferenced as a NULL pointer. The barrier's > + * counterpart is in dma_fence_signal_timestamp_locked(). > + */ > + smp_rmb(); > if (!dma_fence_test_signaled_flag(fence)) Same of course here as well. Thanks, Christian. > return (const char __rcu *)ops->get_driver_name(fence); > else > > base-commit: cdeb2ccd993ed8647adbbda2c3b103aa717fd6f7
