On Tue, 2026-06-09 at 12:53 +0200, Christian König wrote:
> >
> > // driver
> > dma_fence_signal(f); // revokes all accesses to our driver through backend_ops
> > // synchronize_rcu() now unnecessary \o/
> > cleanup(f); // We know that all accessors are gone
> > dma_fence_put(f);
>
> Yeah and exactly that doesn't work.
>
> Just think about the Nouveau case, where you have your fences on a doubly
> linked list.
>
> When the fence lock is independent, e.g. when each fence has its own
> lock, then that lock can't protect this doubly linked list.
>
> So your cleanup path needs to take a lock which protects the list, but you
> then run into lock inversion.

static bool nouveau_fence_is_signaled(struct dma_fence *f)
{
	struct nouveau_fence *fence = to_nouveau_fence(f);
	struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
	struct nouveau_channel *chan;
	bool ret = false;

	rcu_read_lock();
	chan = rcu_dereference(fence->channel);
	if (chan)
		ret = (int)(fctx->read(chan) - fence->base.seqno) >= 0;
	rcu_read_unlock();

	return ret;
}

AFAICT fctx->read() does not take f->lock. So where is the lock
inversion?

Again, ideally we can get to the point where no one except for the
fence subsystem itself has to take the lock manually anymore.

>
> > >
> > > So you are left with few options: Either the fence lock is external,
> > > which we don't want because that makes the fence non-independent, or
> > > cleanup() defers work to irq_work or work_structs, which creates
> > > numerous lifetime issues.
> >
> > Yup, this is uncool and we want to avoid that.
> >
> > But these seem to be the options
> >
> > 1. Ensure proper synchronization
> > 2. Wait for a grace period in a hot path
> > 3. Defer cleanup() with some delay mechanism
> >
> > #1 is by far the cleanest approach. I still cannot see any downside,
> > and quite a few upsides.
> >
> > https://elixir.bootlin.com/linux/v7.1-rc6/source/drivers/dma-buf/dma-fence.c#L1025
> >
> > ^ is already racing with the signaled check.
>
> Yeah so what? That is just an opportunistic check.

What happens if someone signals the fence while the set_deadline()
callback is running?

P.