On Mon, 10 Jun 2002, José Fonseca wrote:

> On 2002.06.10 00:53 Leif Delgass wrote:
> > On Sun, 9 Jun 2002, José Fonseca wrote:
> > >
> > > Leif,
> > >
> > > I have had some problems with the DMA code (not just now, but since the beginning of the week). Sometimes I get lockups and crashes quite easily; other times I get bored of jumping around in UT waiting for something bad to happen.
> >
> > Anything reproducible, or is it at different places every time?
>
> There isn't a true pattern. Sometimes it happens when I click the mouse or keyboard, sometimes it happens by itself, and sometimes it simply doesn't happen at all. I still have no clue.
>
> > > But finally I got something - please look at the attached dump - it shows the ring contents. There are two strange things in that picture:
> > >
> > > 1- The BM_COMMAND value specified in the table head is 0x400001a0, while the register says 0x40000038
> >
> > I think this is because the card is decrementing the byte count as it processes the buffer. You'll notice that BM_SYSTEM_MEM_ADDR also looks like it's being incremented as the buffer is processed. I don't think this indicates a problem. The register state looks like the engine locked while processing the buffer at 0x00534000.
>
> But if so, shouldn't they be incremented/decremented by the same amount?

Maybe, but without knowing the hardware implementation it's hard to say. Maybe the system address is only incremented in 256-byte chunks? I remember seeing BM_COMMAND seemingly being decremented by the hardware in one of your earlier tests (where there was no lockup), so that looked like normal operation to me. A quick test would be to repeatedly sample these registers while a full DMA buffer is being processed. Of course, there will always be a delay between reads, so it's hard to get a true snapshot of the state of multiple registers at a given moment in time.
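Something along these lines might do it (untested sketch: read_reg(), the offsets, and the BM_STATUS/busy bit are placeholders, not the real Rage Pro register map, and printf() stands in for whatever logging the driver uses):

    #include <stdio.h>

    /* Placeholder MMIO accessor -- mmio must point at the mapped
     * register aperture before any of this runs. */
    static volatile unsigned int *mmio;

    static unsigned int read_reg(unsigned int off)
    {
            return mmio[off / 4];
    }

    /* Placeholder offsets and busy bit, not the real values. */
    #define BM_COMMAND         0x00
    #define BM_SYSTEM_MEM_ADDR 0x04
    #define BM_STATUS          0x08
    #define BM_ACTIVE          (1U << 31)

    static void sample_bm_registers(void)
    {
            /* Log how the byte count and the system address move while
             * the bus master is busy.  The two reads aren't atomic as
             * a pair, so the values can be slightly skewed relative to
             * each other. */
            while (read_reg(BM_STATUS) & BM_ACTIVE) {
                    printf("BM_COMMAND=0x%08x BM_SYSTEM_MEM_ADDR=0x%08x\n",
                           read_reg(BM_COMMAND),
                           read_reg(BM_SYSTEM_MEM_ADDR));
            }
    }

If the system address really does advance in coarser steps than the byte count decrements, it would show up as the address holding still across several samples while the count keeps dropping.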
> > ...
> > >
> > > The feeling I get from this is that we are writing to pending buffers, causing the engine to lock. But I don't see how... This has only happened on my testbox (which is rather slow). Have you experienced anything like this?
> >
> > I haven't had any lockups since your changes, apart from once when I was working on the BM_HOSTDATA blit changes, and I haven't had any since I committed my changes. I've done quite a bit of "testing" with various games lately, too. ;) But, as you say, there could be bugs that only
>
> This is good to know. btw, when did you last update from cvs? I was wondering if you had tried the blit changes I made yesterday (June 8).
>
> > crop up on a slower system (I test on a Dell Inspiron 7k laptop w/ PII 400 and a Rage LT Pro 8M AGP 2x). I should mention that I have _EXTRA_CHECKING disabled. I suppose it's possible that the checking code causes a problem, but I'm assuming that you had lockups before adding it.
>
> Yes, the _EXTRA_CHECKING was already part of my attempt to hunt this down...

That's what I figured.

> > One thing to be aware of is that do_release_used_buffers unconditionally frees all pending buffers, so it should only be called when the card is known to be idle and finished with all pending buffers on the ring. But I think the current code is safe in that respect. Maybe it would help in debugging if you also dump the contents of the buffer pointed to by the head (and maybe the tail, if you think there might be a problem there).
>
> Ok. Tomorrow I'll add an extra check there. I'll also add a wait_for_idle in ADVANCE_RING to force a sync with the engine, to see if this still happens.
>
> I just wanted to be sure that there isn't a race condition we aren't aware of... because if there is one, it could cause far more headaches later than it does now.

Sure. We should definitely try to find the source of your lockups before going much further.

> Anyway, I'm getting a little bored with this, so I'm already moving forward too: I'm going to have the DMA* macros hold a DMA buffer across successive calls instead of wasting a buffer for each call.
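Sounds good. I imagine something roughly like this (untested sketch, all names made up -- the point is just that the buffer lives in a persistent struct instead of being fetched and dispatched per macro block):

    struct buffer {
            unsigned int *data;
            int           size;        /* in bytes */
    };

    /* Persistent DMA state, e.g. hanging off dev_priv. */
    struct dma_state {
            struct buffer *buf;        /* held across BEGIN/OUT blocks */
            unsigned int  *ptr;        /* next free dword */
            int            space;      /* dwords left in buf */
    };

    extern struct buffer *dma_get_buffer(void);   /* placeholder */
    extern void dma_flush(struct dma_state *st);  /* queues buf, if any */

    /* Only fetch a fresh buffer when we aren't holding one, or when
     * the one we have can't fit the next n dwords. */
    #define DMA_BEGIN(st, n)                                      \
            do {                                                  \
                    if ((st)->buf == NULL || (st)->space < (n)) { \
                            dma_flush(st);                        \
                            (st)->buf   = dma_get_buffer();       \
                            (st)->ptr   = (st)->buf->data;        \
                            (st)->space = (st)->buf->size / 4;    \
                    }                                             \
            } while (0)

    #define DMA_OUT(st, val)                                      \
            do {                                                  \
                    *(st)->ptr++ = (val);                         \
                    (st)->space--;                                \
            } while (0)

The key difference from what we have now is that nothing is dispatched unconditionally at the end of a block: the buffer is only queued when it fills up, or when something (a blit, a sync) forces a dma_flush().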
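And for the wait_for_idle-in-ADVANCE_RING experiment you mentioned, I'd guess something this simple is enough to rule the race in or out (again just a sketch -- advance_ring() and wait_for_idle() stand in for the real code):

    /* Debug variant: kick the ring as usual, then block until the
     * engine goes idle, so no buffer is ever pending while the CPU
     * keeps writing.  If the lockups disappear with this in place,
     * that points at a race on pending buffers. */
    #define ADVANCE_RING_SYNC(dev_priv)             \
            do {                                    \
                    advance_ring(dev_priv);         \
                    wait_for_idle(dev_priv);        \
            } while (0)

It will be slow, of course, but that's fine for a debugging run.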
> > BTW, did anything come before the ring dump in the log? Do you know where it was triggered from?
>
> No, this time there wasn't anything more. The trigger is hard to determine: the problem is only noticed when waiting for buffers, so sometimes there are some messages related to timeouts.

Yeah, a timeout there is usually just a symptom of a lockup; lots of things could potentially be the root cause.

-- 
Leif Delgass
http://www.retinalburn.net
