On Thu, Jul 28, 2016 at 04:05:28PM +0100, Dave Gordon wrote:
> On 27/07/16 13:29, Chris Wilson wrote:
> >On Wed, Jul 27, 2016 at 12:53:25PM +0100, Dave Gordon wrote:
> >>On 25/07/16 08:44, Chris Wilson wrote:
> >>>If we rewrite the I915_WRITE_TAIL specialisation for the legacy
> >>>ringbuffer as submitting the request onto the ringbuffer, we can unify
> >>>the vfunc with both execlists and GuC in the next patch.
> >>>
> >>>Signed-off-by: Chris Wilson <[email protected]>
> >>>Reviewed-by: Joonas Lahtinen <[email protected]>
> >>>---
> >>>drivers/gpu/drm/i915/i915_gem_request.c |  8 ++---
> >>>drivers/gpu/drm/i915/intel_lrc.c        |  2 +-
> >>>drivers/gpu/drm/i915/intel_ringbuffer.c | 53 +++++++++++++++++----------------
> >>>drivers/gpu/drm/i915/intel_ringbuffer.h |  3 +-
> >>>4 files changed, 32 insertions(+), 34 deletions(-)
> >>>
> >>>diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> >>>index 1c185e293bf0..8814e9c5266b 100644
> >>>--- a/drivers/gpu/drm/i915/i915_gem_request.c
> >>>+++ b/drivers/gpu/drm/i915/i915_gem_request.c
> >>>@@ -467,15 +467,13 @@ void __i915_add_request(struct drm_i915_gem_request *request,
> >>>    */
> >>>   request->postfix = ring->tail;
> >>>
> >>>-  if (i915.enable_execlists) {
> >>>+  if (i915.enable_execlists)
> >>>           ret = engine->emit_request(request);
> >>>-  } else {
> >>>+  else
> >>>           ret = engine->add_request(request);
> >>>-
> >>>-          request->tail = ring->tail;
> >>>-  }
> >>>   /* Not allowed to fail! */
> >>>   WARN(ret, "emit|add_request failed: %d!\n", ret);
> >>>+
> >>>   /* Sanity check that the reserved size was large enough. */
> >>>   ret = ring->tail - request_start;
> >>>   if (ret < 0)
> >>>diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >>>index 567d94de3300..250edb2bcef7 100644
> >>>--- a/drivers/gpu/drm/i915/intel_lrc.c
> >>>+++ b/drivers/gpu/drm/i915/intel_lrc.c
> >>>@@ -373,7 +373,7 @@ static void execlists_update_context(struct drm_i915_gem_request *rq)
> >>>   struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
> >>>   uint32_t *reg_state = rq->ctx->engine[engine->id].lrc_reg_state;
> >>>
> >>>-  reg_state[CTX_RING_TAIL+1] = rq->tail;
> >>>+  reg_state[CTX_RING_TAIL+1] = rq->tail % (rq->ring->size - 1);
> >>
> >>mod ringsize-1 ?
> >>
> >>Surely tail % ringsize, or tail & (ringsize-1).
> >>
> >>But it's redundant anyway, rq->tail cannot exceed ring->size,
> >>so the original code was correct.
> >
> >No, rq->tail can be equal to ring->size which leads to a GPU hang.
> >(Observed on the older gen at least, I'd rather have the same paranoia
> >here.)
> >-Chris
> 
> Even if it's not redundant, it's still the wrong number. The code
> above would result in tail (==size) being converted to 1 rather than
> 0.
> 
> If it's a % operation, it should be ringsize not ringsize-1. Or
> convert to an & operation with ringsize-1.

It was just meant to be & (ring->size-1).
-Chris
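
For reference, a minimal sketch of the three expressions discussed above,
assuming the ring size is a power of two (the mask form is only valid in
that case). The helper names are hypothetical and not part of the patch;
they only illustrate why "% (size - 1)" maps tail == size to 1 rather
than 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: wrap with a mask. Assumes size is a power of two. */
static uint32_t wrap_tail_mask(uint32_t tail, uint32_t size)
{
	return tail & (size - 1);
}

/* Hypothetical helper: wrap with a modulo by the full ring size. */
static uint32_t wrap_tail_mod(uint32_t tail, uint32_t size)
{
	return tail % size;
}

/* The expression as posted in the patch: modulo by size - 1. */
static uint32_t wrap_tail_as_posted(uint32_t tail, uint32_t size)
{
	return tail % (size - 1);
}

/* Illustrative self-check using an assumed 4096-byte ring. */
static void check_wrap_examples(void)
{
	const uint32_t size = 4096;

	/* tail == size must wrap to offset 0 */
	assert(wrap_tail_mask(size, size) == 0);
	assert(wrap_tail_mod(size, size) == 0);

	/* the posted expression wraps it to 1 instead */
	assert(wrap_tail_as_posted(size, size) == 1);

	/* an in-range tail is left untouched by the mask */
	assert(wrap_tail_mask(4088, size) == 4088);
}
```

For a power-of-two size the mask and the modulo by size give identical
results, so the choice between them is a matter of style and codegen.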

-- 
Chris Wilson, Intel Open Source Technology Centre