On Mon, 2026-02-23 at 12:42 +0100, Christian König wrote:
> Hi Philip,
> 
> I only found this message by coincidence, please make sure to always CC
> my AMD work email address as well.
You've been the direct recipient, in the To: header field :)

> On 2/19/26 12:06, Philipp Stanner wrote:
> > Yo Christian,
> > 
> > I'd like to discuss the dma_fence fast path optimization
> > (ops.is_signaled) again.
> > 
> > As far as I understand by now, the use case is that some drivers will
> > never signal fences; but the consumer of the fence actively polls
> > whether a fence is signaled or not.
> > 
> > Right?
> 
> Close but not 100% right. The semantic is that enable_signaling is only
> called when somebody actively waits for the dma_fence to finish.
> 
> So as long as both userspace and kernel only poll for the fence status,
> enable_signaling is never called and only is_signaled is called.

So you're telling me that enable_signaling enables interrupt-driven
signaling, typically. IOW, in some cases you can request that a specific
fence gets signaled the expensive way (interrupt) while polling on the
others.

What is the hw->hw signaling that the documentation details? hw->sw
signaling seems to refer to interrupts.

> What drivers/fence implementations do with that is up to them. For
> example, userqueues use it as preemption signaling, but most drivers
> simply try to avoid waking up the system with IRQs.
> 
> > I have a bunch of questions regarding that:
> > 
> > 1. What does the party polling the fence typically look like? I bet
> >    it's not userspace, is it? Userspace I'd expect to use poll() on
> >    a FD, thus an underlying driver has to check the fence somehow.
> 
> No no, that is indeed userspace.

Userspace has no direct access to a fence. It's, ultimately, a kernel
ioctl through which userspace can check a fence. That's what I meant:
it's kernel code implemented in the driver [but running in the user's
process context].

> As soon as the kernel starts to call dma_fence_wait() (for example) we
> have the normal guaranteed-to-signal semantics we always have.
> 
> > 2. What if that party checks the fence, determines it is unsignaled?
> > Will it then again try later?
> 
> I have no idea, that depends on how the userspace component is
> implemented.
> 
> > 3. If it tries again later anyways, then what is the problem with
> >    the fence-issuing driver itself checking every 5, 10 or 50
> >    milliseconds what the counter in the GPU ring buffer is, and then
> >    signals all those fences?
> 
> That you need to wake up for that, and this costs quite a lot of power.
> 
> See, two different approaches:
> 
> 1. Interrupt driven, e.g. somebody says signal me as soon as possible
>    when the work is done.
> 
> 2. Poll driven, e.g. userspace wakes up every N milliseconds anyway and
>    it doesn't matter if the status changes a bit later.

Makes sense, I guess.

> > So it circles around the question why ops.is_signaled is supposedly
> > unavoidable.
> 
> In addition to the interrupt/poll handling it is also a really important
> optimization for multicore systems, e.g. it makes the signaling state
> visible to other CPU cores even when the core handling the IRQ is still
> busy.

What is the "signaling state"? A fence's signaled status is indicated
through an atomic flag which becomes visible globally once someone, like
said interrupt, has signaled the fence.

P.

> That is also really important for some use cases as far as I know. Keep
> in mind that this framework drives everything from Android mobiles all
> the way up to supercomputers.
> 
> I mean, what we could potentially do is to fix the locking invariant of
> the is_signaled callback, but that is probably the only simplification
> possible without breaking tons of use cases.
> 
> Regards,
> Christian.
> 
> > 
> > Regards
> > P.
> 
