In what seems remarkably similar to the workaround required to not
reload an idle context with HEAD==TAIL, it appears we must prevent the
HW from switching to an idle context in ELSP[1] while we are
simultaneously trying to preempt the HW to run another context and a
continuation of the idle context (which is no longer idle).

We can achieve this by preventing the context from completing while we
reload a new ELSP (by applying ring_set_paused(1) across the whole of
dequeue), except this eventually fails because a lite-restore into a
waiting semaphore does not generate an ACK. Instead, we try to avoid
making the GPU do anything too challenging and do not submit a new ELSP
while the interrupts + CSB events appear to have fallen behind the
completed contexts. We expect them to catch up shortly, so we queue
another tasklet execution and hope for the best.
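
As a rough illustration of the deferral described above, here is a
standalone sketch that models the decision outside the driver. All of
the mock_* names are hypothetical stand-ins, not i915 API, and the
control flow is heavily simplified: it assumes the only question is
whether the head request has completed while a preemption or timeslice
switch is wanted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request; not the driver's i915_request. */
struct mock_request {
	bool completed;		/* models i915_request_completed() */
};

/* Hypothetical stand-in for the execlists state. */
struct mock_execlists {
	struct mock_request *active;	/* head of the active ports, NULL if idle */
	bool want_switch;	/* models need_preempt() or an expired timeslice */
	int reschedules;	/* counts modelled tasklet_hi_schedule() calls */
	int submits;		/* counts modelled ELSP submissions */
};

enum mock_result { MOCK_SUBMITTED, MOCK_DEFERRED };

/*
 * If we want to switch away from a request that has already completed,
 * the CSB events have fallen behind. Do not submit a new ELSP; ask to be
 * run again and let the events catch up first.
 */
static enum mock_result mock_dequeue(struct mock_execlists *el)
{
	struct mock_request *last;

	assert(el != NULL);
	last = el->active;

	if (last && el->want_switch && last->completed) {
		el->reschedules++;
		return MOCK_DEFERRED;
	}

	el->submits++;
	return MOCK_SUBMITTED;
}

/* Models the late CSB event finally retiring the completed request. */
static void mock_process_csb(struct mock_execlists *el)
{
	if (el->active && el->active->completed)
		el->active = NULL;
}
```

In this model, calling mock_dequeue() with a completed head request and
want_switch set only bumps the reschedule counter; once
mock_process_csb() has retired the request, the next call submits.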

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1501
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Mika Kuoppala <[email protected]>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index b12355048501..5f17ece07858 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1915,11 +1915,26 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
         * of trouble.
         */
        active = READ_ONCE(execlists->active);
-       while ((last = *active) && i915_request_completed(last))
-               active++;
 
-       if (last) {
+       /*
+        * In theory we can skip over completed contexts that have not
+        * yet been processed by events (as those events are in flight):
+        *
+        * while ((last = *active) && i915_request_completed(last))
+        *      active++;
+        *
+        * However, the GPU cannot handle this as it will ultimately
+        * find itself trying to jump back into a context it has just
+        * completed and barf.
+        */
+
+       if ((last = *active)) {
                if (need_preempt(engine, last, rb)) {
+                       if (i915_request_completed(last)) {
+                               tasklet_hi_schedule(&execlists->tasklet);
+                               return;
+                       }
+
                        ENGINE_TRACE(engine,
                                     "preempting last=%llx:%lld, prio=%d, hint=%d\n",
                                     last->fence.context,
@@ -1947,6 +1962,11 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
                        last = NULL;
                } else if (need_timeslice(engine, last) &&
                           timer_expired(&engine->execlists.timer)) {
+                       if (i915_request_completed(last)) {
+                               tasklet_hi_schedule(&execlists->tasklet);
+                               return;
+                       }
+
                        ENGINE_TRACE(engine,
                                     "expired last=%llx:%lld, prio=%d, hint=%d\n",
                                     last->fence.context,
-- 
2.20.1
