If we process DROP_RESET_ACTIVE and cancel all outstanding requests by
forcing a GPU reset on hardware where reset is disabled or not
supported, we certainly end up with a terminally wedged GPU that is
impossible to recover.  That's probably not what we want.

Before wedging the GPU, verify that GPU reset is available and fail
with -EBUSY if it is not.

Suggested-by: Petri Latvala <[email protected]>
Signed-off-by: Janusz Krzysztofik <[email protected]>
Cc: Michał Wajdeczko <[email protected]>
Cc: Michał Winiarski <[email protected]>
Cc: Piotr Piórkowski <[email protected]>
Cc: Tomasz Lis <[email protected]>
Cc: Petri Latvala <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Martin Peres <[email protected]>
---
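For reviewers, the intended control flow can be modelled outside the
kernel.  This is only a sketch under stated assumptions: struct mock_gt
and mock_drop_reset_active() are hypothetical stand-ins, not driver code.
The engines_idle field stands for wait_for(intel_engines_are_idle(...))
completing before its timeout, and has_reset stands for the result of
intel_has_gpu_reset().

```c
#include <assert.h>   /* only needed if you add the checks described below */
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GT state; not the real struct intel_gt. */
struct mock_gt {
	bool engines_idle;	/* engines went idle before the timeout */
	bool has_reset;		/* models intel_has_gpu_reset() */
	bool wedged;		/* models the effect of intel_gt_set_wedged() */
};

/* Models the DROP_RESET_ACTIVE branch as changed by this patch. */
static int mock_drop_reset_active(struct mock_gt *gt)
{
	if (!gt->engines_idle) {
		/* Without a usable reset, wedging would be terminal. */
		if (!gt->has_reset)
			return -EBUSY;

		gt->wedged = true;
	}

	return 0;
}
```

In this model, busy engines without reset support yield -EBUSY and leave
the GT unwedged, busy engines with reset support get wedged as before,
and idle engines are left untouched in both cases.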
 drivers/gpu/drm/i915/i915_debugfs.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index fec9fb7cc384..0774ca6e2a05 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3627,8 +3627,17 @@ i915_drop_caches_set(void *data, u64 val)
 
        if (val & DROP_RESET_ACTIVE &&
            wait_for(intel_engines_are_idle(&i915->gt),
-                    I915_IDLE_ENGINES_TIMEOUT))
+                    I915_IDLE_ENGINES_TIMEOUT)) {
+               /*
+                * Only wedge if reset is supported and not disabled, otherwise
+                * we certainly end up with the GPU terminally wedged.  Inform
+                * userspace about the problem instead.
+                */
+               if (!intel_has_gpu_reset(&i915->gt))
+                       return -EBUSY;
+
                intel_gt_set_wedged(&i915->gt);
+       }
 
        /* No need to check and wait for gpu resets, only libdrm auto-restarts
         * on ioctls on -EAGAIN. */
-- 
2.21.0
