On 31/07/2025 13:48, Karunika Choo wrote:
> On 31/07/2025 11:57, Steven Price wrote:
>> On 30/07/2025 18:43, Karunika Choo wrote:
>>> In certain scenarios, it is possible for multiple cache flushes to be
>>> requested before the previous one completes. This patch introduces the
>>> cache_flush_lock mutex to serialize these operations and ensure that
>>> any requested cache flushes are completed instead of dropped.
>>>
>>> Signed-off-by: Karunika Choo <[email protected]>
>>> Co-developed-by: Dennis Tsiang <[email protected]>
>>
>> A Co-Developed-By needs to have a signed-off-by too[1]
>
> Oops. I can push a v2 to add those.
>
>>
>> [1]
>> https://www.kernel.org/doc/html/latest/process/submitting-patches.html#when-to-use-acked-by-cc-and-co-developed-by
>>
>> But I also don't understand how this is happening. The only caller to
>> panthor_gpu_flush_caches() is in panthor_sched_suspend() and that is
>> holding the sched->lock mutex.
>
> The fix is in relation to the enablement of GPU Flush caches by default
> for all GPUs [1]. While calls from the MMU are serialized, other calls
> i.e. from panthor_sched_suspend() are not. As such, this patch
> explicitly serializes these operations.

Ah, ok so this is effectively a bug fix for that patch - given we've not
yet merged that series can we just do a v9 of the series with the fix
rolled in? (Rather than having a commit or two where we know the bug is
present).

I have to admit it also feels like we should have something to avoid
doing excessive cache flushes - there's no point in queuing up multiple
flushes back-to-back. But I don't have a neat solution, and I'm not sure
whether this will happen often enough to worry about. So I guess we
should probably ignore it until/unless it becomes a problem.

Steve

> [1]
> https://lore.kernel.org/all/[email protected]/
>
> Kind regards,
> Karunika Choo
>
>> Steve
>>
>>> ---
>>>  drivers/gpu/drm/panthor/panthor_gpu.c | 7 +++++++
>>>  1 file changed, 7 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c
>>> b/drivers/gpu/drm/panthor/panthor_gpu.c
>>> index cb7a335e07d7..030409371037 100644
>>> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
>>> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
>>> @@ -35,6 +35,9 @@ struct panthor_gpu {
>>>
>>>  	/** @reqs_acked: GPU request wait queue. */
>>>  	wait_queue_head_t reqs_acked;
>>> +
>>> +	/** @cache_flush_lock: Lock to serialize cache flushes */
>>> +	struct mutex cache_flush_lock;
>>>  };
>>>
>>>  /**
>>> @@ -204,6 +207,7 @@ int panthor_gpu_init(struct panthor_device *ptdev)
>>>
>>>  	spin_lock_init(&gpu->reqs_lock);
>>>  	init_waitqueue_head(&gpu->reqs_acked);
>>> +	mutex_init(&gpu->cache_flush_lock);
>>>  	ptdev->gpu = gpu;
>>>  	panthor_gpu_init_info(ptdev);
>>>
>>> @@ -353,6 +357,9 @@ int panthor_gpu_flush_caches(struct panthor_device
>>> *ptdev,
>>>  	bool timedout = false;
>>>  	unsigned long flags;
>>>
>>> +	/* Serialize cache flush operations. */
>>> +	guard(mutex)(&ptdev->gpu->cache_flush_lock);
>>> +
>>>  	spin_lock_irqsave(&ptdev->gpu->reqs_lock, flags);
>>>  	if (!drm_WARN_ON(&ptdev->base,
>>>  			 ptdev->gpu->pending_reqs &
>>>  			 GPU_IRQ_CLEAN_CACHES_COMPLETED)) {
>>
>
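
For anyone following the thread, the serialization the patch describes
can be illustrated with a rough user-space analogue. This is only a
sketch, not driver code: struct fake_gpu, fake_gpu_init(),
fake_gpu_flush_caches() and run_flush_workers() are hypothetical names,
pthread mutexes stand in for both the kernel mutex and the reqs_lock
spinlock, and sched_yield() stands in for waiting on the completion
interrupt. The outer lock is held for the whole request/wait cycle, so
no caller should ever find a flush already pending and have its request
dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GPU state; not the panthor structures. */
struct fake_gpu {
	pthread_mutex_t cache_flush_lock; /* serializes whole flush operations */
	pthread_mutex_t reqs_lock;        /* protects the fields below */
	bool flush_pending;               /* a flush request is in flight */
	unsigned int issued;              /* flushes actually issued */
	unsigned int dropped;             /* flushes skipped: one was pending */
};

static void fake_gpu_init(struct fake_gpu *gpu)
{
	pthread_mutex_init(&gpu->cache_flush_lock, NULL);
	pthread_mutex_init(&gpu->reqs_lock, NULL);
	gpu->flush_pending = false;
	gpu->issued = 0;
	gpu->dropped = 0;
}

static void fake_gpu_flush_caches(struct fake_gpu *gpu)
{
	bool issued = false;

	/* Held across request and completion, like the guard in the patch. */
	pthread_mutex_lock(&gpu->cache_flush_lock);

	pthread_mutex_lock(&gpu->reqs_lock);
	if (!gpu->flush_pending) {
		gpu->flush_pending = true;
		gpu->issued++;
		issued = true;
	} else {
		/* Without the outer lock, a concurrent caller could land here. */
		gpu->dropped++;
	}
	pthread_mutex_unlock(&gpu->reqs_lock);

	if (issued) {
		sched_yield(); /* assumption: stands in for the IRQ wait */
		pthread_mutex_lock(&gpu->reqs_lock);
		gpu->flush_pending = false;
		pthread_mutex_unlock(&gpu->reqs_lock);
	}

	pthread_mutex_unlock(&gpu->cache_flush_lock);
}

struct worker_arg {
	struct fake_gpu *gpu;
	int iterations;
};

static void *flush_worker(void *data)
{
	struct worker_arg *arg = data;

	for (int i = 0; i < arg->iterations; i++)
		fake_gpu_flush_caches(arg->gpu);
	return NULL;
}

/* Run nthreads (at most 8) workers, each requesting `iterations` flushes. */
static void run_flush_workers(struct fake_gpu *gpu, int nthreads, int iterations)
{
	pthread_t threads[8];
	struct worker_arg arg = { .gpu = gpu, .iterations = iterations };

	assert(nthreads > 0 && nthreads <= 8);
	for (int i = 0; i < nthreads; i++)
		assert(pthread_create(&threads[i], NULL, flush_worker, &arg) == 0);
	for (int i = 0; i < nthreads; i++)
		pthread_join(threads[i], NULL);
}
```

Calling fake_gpu_init() and then run_flush_workers() with, say, 4
threads and 100 iterations should leave the issued count equal to the
number of requests and the dropped count at zero. Note that this does
nothing to coalesce back-to-back flushes; it only models the "complete
rather than drop" behaviour discussed above.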
