On Wed, 5 Jun 2002, José Fonseca wrote:

> On 2002.06.05 20:34 Leif Delgass wrote:
> > On Wed, 5 Jun 2002, José Fonseca wrote:
> > > ...
> > >
> > > In _wait_ring you don't check if the card is idle (and resume), so you
> > > could wait forever for free space.
> > 
> > I do see the problem there.  I was thinking that this couldn't be
> > triggered with the current code because wait_ring would never be entered
> > (can't fill the ring).  I realized that ring.space is only kept up to
> > date with regard to the ring head by wait_ring, so it is possible.  I
> > went back and added a flush to the ring space macro in my version and it
> > seemed to fix the lockup I was having with the q3demo, although I haven't
> > tested long enough to see if there aren't other lockups.
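
For the record, roughly the kind of check I mean, sketched out (the function
and field names below are made up for illustration, they're not the actual
driver code):

/* Sketch: wait for ring space, but restart DMA if the engine has gone
 * idle with entries still queued, so we can't spin here forever.
 */
static int mach64_wait_ring_sketch( drm_mach64_private_t *dev_priv, int n )
{
	int i;

	for ( i = 0; i < dev_priv->usec_timeout; i++ ) {
		mach64_update_ring_snapshot( dev_priv );  /* re-read head, recompute space */

		if ( dev_priv->ring.space >= n )
			return 0;                         /* enough room */

		/* Idle with entries still queued: the head will never advance
		 * on its own, so kick DMA off again (resume).
		 */
		if ( mach64_engine_idle( dev_priv ) &&
		     dev_priv->ring.space < dev_priv->ring.size )
			mach64_dma_start( dev_priv->dev );

		udelay( 1 );
	}
	return -EBUSY;  /* timed out waiting for ring space */
}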
> > 
> > > In do_dma_idle you just called _wait_for_idle, but you didn't account
> > > for the possibility that the card hadn't finished processing.
> > 
> > The reason for this was that I have a separate ioctl for flushing DMA.
> > DMA can be idle without the queue being empty; if the queue is empty,
> > then the card is quiescent.  Most places where there is a wait for idle,
> > it's preceded by a flush, e.g. in Read/WritePixels, the XAA sync, etc.
> > The question is: are there still places where we call the idle ioctl when
> > quiescence is _not_ required (there is no preceding flush)?  If not, we
> > wouldn't need the flush ioctl (assuming we never need to flush from
> > outside the drm without waiting for idle), and your version of
> > do_dma_idle is fine since it flushes before waiting for idle.
> 
> mmm.. I see. If we can avoid flushing all the DMA buffers for XAA, so much
> the better. That will make X more responsive.

The problem is that when the X server gets the lock, the window coords and
cliprects may have changed by the time we get the lock back.  That leaves
data in the queue with old clipping info if we don't flush in the XAA sync.
That's why I _added_ a flush in the XAA sync.  I have to look through the
code to see if there is still a place where we don't need to flush.
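
To spell out the idle/quiescent distinction, here's a sketch (these names are
illustrative assumptions on my part, not the real ioctl handlers):

/* "Idle" only means the bus-master engine has stopped; queued buffers may
 * still be waiting.  "Quiescent" means flush first, then wait, so nothing
 * with stale cliprects is left sitting in the queue.
 */
static int mach64_do_dma_idle_sketch( drm_mach64_private_t *dev_priv )
{
	return mach64_wait_for_idle( dev_priv );
}

static int mach64_do_dma_quiescent_sketch( drm_mach64_private_t *dev_priv )
{
	int ret = mach64_do_dma_flush( dev_priv );  /* submit queued buffers */

	if ( ret )
		return ret;
	return mach64_wait_for_idle( dev_priv );
}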
 
> > 
> > > These are just two that I can think of.  I don't recall if there are
> > > more.  I dealt with the problem by checking in UPDATE_HEAD for the case
> > > where the card is idle but there are still buffers to process.
> > 
> > I see why you did this, but I think it obfuscates the code a bit to 
> > flush inside the UPDATE_RING_HEAD macro.  Maybe we could rename it to
> > UPDATE_HEAD_WITH_FLUSH or something to make it more explicit.  Also, I
> 
> But is there any place where UPDATE_RING_HEAD shouldn't resume the 
> operation?

Maybe not, but I was a bit confused when I first looked at the code until 
I realized that the DMA was actually being started there.
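
Something like this would at least make it obvious at the call site; a sketch
only (the macro body and helper names are my assumptions):

#define UPDATE_HEAD_WITH_FLUSH( dev_priv )                                   \
do {                                                                         \
	(dev_priv)->ring.head = mach64_get_ring_head( dev_priv );           \
	/* the non-obvious part: engine idle but entries queued => restart DMA */ \
	if ( mach64_engine_idle( dev_priv ) &&                               \
	     (dev_priv)->ring.head != (dev_priv)->ring.tail ) {              \
		mach64_dma_start( (dev_priv)->dev );                         \
	}                                                                    \
} while (0)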

> Later on I plan to take some of that code out of the macro and put it in a
> subroutine that does a little more analysis on the number of remaining
> entries, to decide whether it's worth (or _necessary_!) flushing them or
> waiting a little longer to buffer enough entries for smoother operation.
> 
> > noticed that you removed the bounds check on BM_GUI_TABLE.  Without that,
> 
> I confess that I didn't notice that.
> 
> > you need to be certain that the card is not locked up when starting a new
> > pass, because the ring_head could be an invalid address.  I guess if the
> > card is locked, GUI_STAT will still read as active, so maybe the check
> > isn't necessary, but we should make sure that the new path can handle a
> > lockup.  One of the reasons I wanted to keep the old path in the code for
> > a while was to make sure there isn't some problem with the way we're using
> > BM_GUI_TABLE that we haven't encountered or tracked down yet.
> > 
> 
> Ok. After committing this set of changes I'll do as above, and add this
> check too, along with that subroutine I mentioned.
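
For reference, the sort of bounds check I had in mind, sketched (the register
access macro and ring fields here are assumptions):

/* Sketch: validate the head address read back from BM_GUI_TABLE before
 * trusting it -- if the card has locked up it may point anywhere.
 */
static int mach64_get_ring_head_checked( drm_mach64_private_t *dev_priv,
					 u32 *head )
{
	u32 addr = MACH64_READ( MACH64_BM_GUI_TABLE ) & 0xfffffff0;

	if ( addr < dev_priv->ring.start_addr ||
	     addr >= dev_priv->ring.start_addr + dev_priv->ring.size_bytes )
		return -EINVAL;  /* out of bounds: probable lockup, don't use it */

	*head = ( addr - dev_priv->ring.start_addr ) / sizeof(u32);
	return 0;
}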
> 
> > ...
> > >
> > > As I said in my previous mail (I hadn't received this one yet), it's not
> > > on my priority list.  I'd rather have the DMA work properly and without
> > > faults than lose time maintaining more than one code path.  Having both
> > > paths at runtime is going to be difficult.  Perhaps it's better to leave
> > > the #ifdefs and, in the end, just make the pseudo-dma code play with the
> > > rest; whoever wants to use that path changes a macro and recompiles the
> > > DRM.
> > 
> > I understand, and I know it makes the code harder to read.  I think there
> > are still problems with DMA on ppc, but I haven't heard from Peter for a
> > while.  I was thinking of pseudo-DMA as a way to debug DMA more than a
> > path that would be for everyday use.  I've used it for this purpose and
> > it's been helpful to have, but we can remove it once things are known to
> > be stable.
> 
> The idea I had was that the problem was the other way around, i.e., DMA 
> worked, but for PIO it was necessary to do wait_for_idle instead of 
> wait_for_fifo.

I remember the wait_for_idle/wait_for_fifo problem, but I thought the DMA 
path never really worked right for Peter.  I'll have to look back at the 
posts.

> Mentioning PPC, we have to consider merging the trunk back into the mach64 
> branch again to pick up all the fixes there (such as PPC, the garbage when 
> closing windows, and a couple more I don't remember). I just don't know 
> when the right time is, as this is rather painful and time-consuming to do...

Maybe when Mesa 4.0.3 is released? (I think this will be soon).  We'll 
also have to deal with the drm-command changes when we merge.

> > ...
> > 
> > In general I think the changes look good.  Here are a couple of minor
> > fixes for your patch:
> > 
> > - dma_start needs a prototype in mach64_drv.h since it's referenced in
> > the macros, which are used in mach64_state.c.
> > 
> 
> Yep, indeed.
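
i.e., something along these lines in mach64_drv.h (the exact name and
signature here are a guess on my part):

/* needed so the macros used from mach64_state.c see a real declaration */
extern int mach64_dma_start( drm_device_t *dev );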
> 
> > - I think you have an endianness problem with ADVANCE_RING.  The command
> > read from the ring will be in cpu endianness, and then you're or/and-ing it
> > with a little-endian value and writing it back with the |= and &=.  I think
> > you need to move the read inside the cpu_to_le32 and use a simple
> > assignment:
> > 
> > ring[(write-2) & mask] = cpu_to_le32( ring[(write-2) & mask] | DMA_EOL );
> > 
> > maybe use a temp var for the dword offset?
> > 
> 
> I disagree. The correct way would be
> 
>    ring[(write-2) & mask] = cpu_to_le32( le32_to_cpu( ring[(write-2) & mask] ) | DMA_EOL );
> 
> but since the cpu_to_le32/le32_to_cpu macros are basically byte swaps, they
> distribute over the bitwise or, so that becomes
> 
>    ring[(write-2) & mask] = ring[(write-2) & mask] | cpu_to_le32( DMA_EOL );
> 
> or simply
> 
>    ring[(write-2) & mask] |= cpu_to_le32( DMA_EOL );
> 

You're right, I forgot that we've already swapped the data in the ring.
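
Put as a small sketch of the end result (the helper names are mine, not from
the patch; the ring already holds little-endian dwords, so only the DMA_EOL
constant gets swapped):

#define RING_SET_EOL( ring, write, mask )				\
do {									\
	int _eol = ((write) - 2) & (mask);  /* temp var for the dword offset */ \
	(ring)[_eol] |= cpu_to_le32( DMA_EOL );				\
} while (0)

#define RING_CLEAR_EOL( ring, write, mask )				\
do {									\
	int _eol = ((write) - 2) & (mask);				\
	(ring)[_eol] &= ~cpu_to_le32( DMA_EOL );			\
} while (0)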

-- 
Leif Delgass 
http://www.retinalburn.net

