On 05/12/2025 12:46, Boris Brezillon wrote:
On Thu, 4 Dec 2025 09:42:37 -0800
Chia-I Wu <[email protected]> wrote:
On Thu, Dec 4, 2025 at 1:27 AM Tvrtko Ursulin <[email protected]> wrote:
On 04/12/2025 01:50, Chia-I Wu wrote:
Commit 506aa8b02a8d6 ("dma-fence: Add safe access helpers and document
the rules") details the dma-fence safe access rules. The most common
culprit is that drm_sched_fence_get_timeline_name may race with
group_free_queue.
Fixes: d2624d90a0b77 ("drm/panthor: assign unique names to queues")
Signed-off-by: Chia-I Wu <[email protected]>
---
drivers/gpu/drm/panthor/panthor_sched.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c
b/drivers/gpu/drm/panthor/panthor_sched.c
index 33b9ef537e359..a8b1347e4da71 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -23,6 +23,7 @@
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/pm_runtime.h>
+#include <linux/rcupdate.h>
#include "panthor_devfreq.h"
#include "panthor_device.h"
@@ -923,6 +924,9 @@ static void group_release_work(struct work_struct *work)
release_work);
u32 i;
+ /* dma-fences may still be accessing group->queues under rcu lock. */
+ synchronize_rcu();
+
for (i = 0; i < group->queue_count; i++)
group_free_queue(group, group->queues[i]);
This handles the shared queue->fence_ctx.lock as well (which is also
unsafe until Christian lands the inline lock, etc patch series) so it
looks good to me as well.
Yeah, I will send v2 to drop the misleading "Fixes:" tag.
FWIW, the UAF I saw was from accessing the string returned by
static const char *drm_sched_fence_get_timeline_name(struct dma_fence *f)
{
struct drm_sched_fence *fence = to_drm_sched_fence(f);
return (const char *)fence->sched->name;
}
IIRC, the only place calling this callback is some debugsfs knob
dumping fences attached to dma_buf resvs, and we're not supposed to
expose our driver fences to the outside world (we use the
drm_sched_fence proxy for that), so I'm curious where the access was
coming from.
Via the sync_file uapi.
For reference here is a reproducer for xe:
https://patchwork.freedesktop.org/patch/642709/?series=146211&rev=2
Regards,
Tvrtko
I thought it was "name" and added the "Fixes:" tag. But actually
"sched" was also freed by group_release_work.
Just to mention an alternative could be to simply switch release_work to
INIT_RCU_WORK/queue_rcu_work, but I am not sure if that has an advantage.
Regards,
Tvrtko