Jose,

I've been experimenting with this too, and was able to get things going
with state being emitted either from the client or the drm, though I'm
still having lockups and things are generally a bit buggy and unstable.
To try client-side context emits, I basically went back to having each
primitive emit state into the vertex buffer before adding the vertex
data, like the original hack with MMIO.  This works, but may be emitting
state when it's not necessary.  Now I'm trying state emits in the drm, and
to do that I'm just grabbing a buffer from the freelist and adding it to
the queue before the vertex buffer, so things are in the correct order in
the queue.  The downside is that buffer space is wasted, since the state
emit uses only a small portion of a buffer, but putting state in a separate
buffer from the vertex data is what allows the proper ordering in the queue.
Perhaps we could use a private set of smaller buffers for this.  At any
rate, I've done the same for clears and swaps, so I have asynchronous DMA
(minus blits) working with gears at least.  I'm still getting lockups with
anything more complicated and there are still some state problems.  The
good news is that I'm finally seeing an increase in frame rate, so there's
light at the end of the tunnel.
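
Roughly, the drm-side ordering I described looks like this (a sketch
only -- the function names here are placeholders to show the idea, not
the actual code):

    /* Grab a buffer from the freelist for the state emit.  Most of
     * the buffer is wasted, since the context registers only fill a
     * small part of it, but it keeps state and vertices in separate
     * buffers so the queue stays ordered. */
    drm_buf_t *state_buf = mach64_freelist_get( dev_priv );
    mach64_emit_context_state( dev_priv, state_buf );

    /* Queue the state buffer ahead of the vertex buffer, so the
     * engine processes the register writes before the vertices. */
    mach64_add_to_queue( dev_priv, state_buf );
    mach64_add_to_queue( dev_priv, vertex_buf );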

Right now I'm using 1MB (half the buffers) as the high water mark, so
there should always be plenty of available buffers for the drm.  To get
this working, I've used buffer aging rather than interrupts.  What I
realized with interrupts is that there doesn't appear to be an interrupt
source that fires often enough to keep up, since VBLANK is tied to the
vertical refresh, which is relatively infrequent.  I'm thinking that it
might be best to start out without interrupts and to use GUI masters for
blits and then investigate using interrupts, at least for blits.  Anyway,
I have an implementation of the freelist and other queues that's
functional, though it might require some locks here and there.  
I'll try to stabilize things more and send a patch for you to look at.
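
In case the aging scheme isn't clear, it boils down to this (again a
sketch; read_engine_age() and the other names are stand-ins for whatever
we end up with):

    /* Stamp each buffer with the current age as it's queued. */
    buf_priv->age = ++dev_priv->submit_age;

    /* When reclaiming: read back the age of the last buffer the
     * engine has completed (read_engine_age() stands in for however
     * we do that).  A buffer whose stamp has been passed is safe to
     * reuse. */
    if ( buf_priv->age <= read_engine_age( dev_priv ) )
            mach64_freelist_put( dev_priv, buf );

With the 1MB high water mark, a client blocks before more than half the
buffers are in flight, so the freelist shouldn't run dry.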

I've also played around some more with AGP textures.  I've hacked up
client-side performance boxes using clear ioctls, which helps to see
what's going on.  I'll try to clean that up so I can commit it.  I've
found some problems with the global LRU and texture aging that I'm trying
to fix as well.  I'll post a more detailed summary of that soon.

BTW, as to your question about multiple clients and state:  I think this 
is handled when acquiring the lock.  If the context stamp on the SAREA 
doesn't match the current context after getting the lock, everything is 
marked as dirty to force the current context to emit all its state.
Emitting state to the SAREA is always done while holding the lock.
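
In pseudo-C, the check looks something like this (the field and macro
names are approximations of the usual DRI pattern, not necessarily
what mach64 uses):

    LOCK_HARDWARE( ctx );               /* take the hardware lock */
    if ( sarea->ctx_owner != ctx->id ) {
            /* Another client held the lock since we last did, so
             * our hardware state is stale: mark everything dirty
             * to force a full state emit. */
            sarea->ctx_owner = ctx->id;
            ctx->dirty = ~0;            /* everything dirty */
    }
    emit_dirty_state( ctx );            /* done while holding the lock */
    UNLOCK_HARDWARE( ctx );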

Regards,

Leif

On Sun, 12 May 2002, José Fonseca wrote:

> As it becomes more clear that on the mach64 the best solution is to fill
> DMA buffers with the context state and the vertex buffers, I've been trying
> to understand how this can be done and how the Gamma driver (which has
> this same model) does it.
> 
> The context state is available right at the beginning of running a
> pipeline, and usually DDUpdateHWState is called at the beginning of
> RunPipeline. The problem is that although all state information is
> available, we don't know which part should be uploaded, since other clients
> could dirty the hardware registers in the meantime.
> 
> I don't fully understand how the Gamma driver overcomes this. Its
> behavior here is controlled by a macro definition, named DO_VALIDATE,
> that enables a series of VALIDATE_* macros whose purpose I couldn't
> figure out. Another thing that caught my attention was the "HACK"
> comment in gammaDDUpdateHWState before gammaEmitHwState - it is
> reminiscent of a similar comment in mach64, which makes one think that
> the author had a better way in mind. Alan, could you shed some light on
> these two issues please?
> 
> Before I started this little research, I had already given some thought to
> how I would do it. One idea that crossed my mind was to reserve some space
> in the DMA buffer to put the context state in before submitting the buffer.
> Of course there would be some DMA buffer waste, but it wouldn't be that
> much, since there is a fairly small number of context registers. One thing
> that holds me back is that I still don't understand how multiple clients
> avoid each other: what is done in parallel, and what is done serially...
> 
> I would also appreciate any ideas regarding this. This is surely an issue
> I would like to discuss further at the next meeting.
> 
> Regards,
> 
> José Fonseca

-- 
Leif Delgass 
http://www.retinalburn.net


