This series collects previously posted patches that fix two panthor
scheduler issues: spurious timeouts, and a race condition that occurs
when a termination fails.

Timeout recovery has been tested with some IGT tests issuing jobs with
infinite loops [1]. It's certainly not enough to claim that everything
works as it should, but that's still more testing than we had so far ;-).

[1] https://gitlab.freedesktop.org/bbrezillon/igt-gpu-tools/-/commit/15c3ee220808a437a76638bd21fedfb4498a434f

Changes in v8:
- Don't touch drm_gpu_scheduler::timeout

Changes in v7:
- Add Steve's R-b
- Use the local group variable when we can

Changes in v6:
- Re-order changes
- Dropped the Fixes tag on one patch
- Cover UAF situation when the timeout work is pending/running at group
  destruction time

Changes in v5:
- Switched to a patch series to make sure the patch which addresses the
  bug is added as a requirement on the scheduler patch.

Changes in v4:
- Moved code related to a timeout bug to a separate patch as this was
  not relevant to this change.

Changes in v3:
- Moved to a patch series to make sure this bug fix happens before the
  changes to the scheduler

Changes in v2:
- Fixed syntax error

Ashley Smith (2):
  drm/panthor: Make the timeout per-queue instead of per-job
  drm/panthor: Reset queue slots if termination fails

 drivers/gpu/drm/panthor/panthor_sched.c | 295 +++++++++++++++++-------
 1 file changed, 213 insertions(+), 82 deletions(-)

-- 
2.51.1
