On Tue, Sep 27, 2011 at 06:31:59PM +0100, Chris Wilson wrote:
> On Tue, 27 Sep 2011 09:46:14 -0700, Ben Widawsky <[email protected]> wrote:
> > On Tue, 27 Sep 2011 12:03:22 +0200
> > Daniel Vetter <[email protected]> wrote:
> >
> > > On Mon, Sep 26, 2011 at 10:22:01PM -0700, Ben Widawsky wrote:
> > > > On Mon, 26 Sep 2011 19:59:50 +0200
> > > > Daniel Vetter <[email protected]> wrote:
> > > > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > > > index da5d607..09c11e4 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > > > @@ -1694,7 +1694,7 @@ void i915_hangcheck_elapsed(unsigned long data)
> > > > >      if (dev_priv->hangcheck_count++ > 1) {
> > > > >          DRM_ERROR("Hangcheck timer elapsed... GPU hung\n");
> > > > >
> > > > > -        if (!IS_GEN2(dev)) {
> > > > > +        if (!IS_GEN2(dev) && i915_try_reset) {
> > > > >              /* Is the chip hanging on a WAIT_FOR_EVENT?
> > > > >               * If so we can simply poke the RB_WAIT bit
> > > > >               * and break the hang. This should work on
> > > >
> > > > I think you should also be able to accomplish the same thing
> > > > with enable_hangcheck param. I had the same problem with the
> > > > debugger :)
> > >
> > > I agree. Iirc you have some patches floating in that area to make the
> > > hangcheck a bit more robust. Can you maybe add this to that series and
> > > (re-)submit?
> > >
> > > Cheers, Daniel
> >
> > While 9/10 times daniel > ben, I'm playing my 10% card here and
> > suggesting that mixing the reset variable and ring kick is not the right
> > way to go about this.
>
> One purpose of the i915.reset parameter is to disable any automatic
> attempts to recover from a hang condition so that the error state is not
> misleading. So preventing the kick ring does help in that regard.
>
> A second purpose is to prevent i915_reset() from causing havoc and hanging
> the machine. Daniel is implying that kicking the rings is instrumental in
> making matters worse. Again using i915.reset to prevent kicking the rings
> fits in with that purpose.
>
> Since I regard kicking rings as a form of reset, I don't see it as a
> conflation of terms and so a valid use of i915.reset.

Couldn't have said it any better. The bad effect of kicking stuck rings
is mostly that when we have a sync problem, there's a decent chance
somebody has written garbage into our batchbuffers. Continuously trying
to execute said garbage is just tempting fate and relying on the gpu's
error resilience.

-Daniel
-- 
Daniel Vetter
Mail: [email protected]
Mobile: +41 (0)79 365 57 48
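
For readers following the thread, here is a minimal standalone sketch of the decision the quoted hunk changes. It is not the driver code: the struct, the enum and hangcheck_decide() are hypothetical names made up for illustration, and only i915_try_reset and the "count++ > 1" test mirror the diff. The split between a full reset and report-only is an assumption based on Chris's description of i915.reset.

```c
#include <stdbool.h>

/* Hypothetical outcome of one hangcheck tick (not a driver type). */
enum hang_action {
    HANG_ACTION_NONE,        /* GPU not yet considered hung */
    HANG_ACTION_KICK_RING,   /* poke the wait bit to unstick the ring */
    HANG_ACTION_FULL_RESET,  /* hand over to the full reset path */
    HANG_ACTION_REPORT_ONLY  /* record the error state, touch nothing */
};

/* Hypothetical stand-in for the per-device state the hunk reads. */
struct hang_state {
    int  hangcheck_count;    /* consecutive ticks without progress */
    bool is_gen2;            /* stands in for IS_GEN2(dev) */
    bool waiting_on_event;   /* ring is stuck on a WAIT_FOR_EVENT */
};

/* Mirrors the variable in the quoted diff (the i915.reset parameter). */
static bool i915_try_reset = true;

static enum hang_action hangcheck_decide(struct hang_state *s)
{
    /* Same test as the diff: only the third stalled tick counts as a hang. */
    if (s->hangcheck_count++ > 1) {
        /* The patched condition: the kick is a form of reset, so the
         * same switch that disables resets also disables the kick. */
        if (!s->is_gen2 && i915_try_reset && s->waiting_on_event)
            return HANG_ACTION_KICK_RING;

        /* Assumption: with resets disabled, the error is only recorded
         * so that the captured error state stays untouched. */
        return i915_try_reset ? HANG_ACTION_FULL_RESET
                              : HANG_ACTION_REPORT_ONLY;
    }
    return HANG_ACTION_NONE;
}
```

With i915_try_reset cleared, the sketch never returns the kick or reset actions, which is the behaviour argued for above: the garbage in the batchbuffers is left alone rather than executed again.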
